Introduction to ARL


The  Army  Research  Laboratory  of  the  U.S.  Army  Research,  Development  and  Engineering  Command  (RDECOM)  is  the  Army’s 
corporate  laboratory.  ARL’s  research  continuum  focuses  on  basic  and  applied  research  (6.1  and  6.2)  and  survivability/lethality 
and  human  factors  analysis  (6.6).  ARL  also  applies  the  extensive  research  and  analysis  tools  developed  in  its  direct  mission 
program  to  support  ongoing  development  and  acquisition  programs  in  the  Army  Research,  Development  and  Engineering 
Centers  (RDECs),  Program  Executive  Offices  (PEOs)/Program  Manager  (PM)  Offices,  and  Industry.  ARL  has  consistently  provided 
the  enabling  technologies  in  many  of  the  Army’s  most  important  weapons  systems. 

The  Soldiers  of  today  and  tomorrow  depend  on  us  to  deliver  the  scientific  discoveries,  technological  advances,  and  the  analyses 
that  provide  Warfighters  with  the  capabilities  to  execute  full-spectrum  operations.  ARL  has  Collaborative  Technology  Alliances  in 
Micro  Autonomous  Systems  Technology,  Robotics,  Cognition  and  Neuroergonomics,  Network  Science,  an  International  Technology 
Alliance  and  new  Collaborative  Research  Alliances  in  Multiscale  Multidisciplinary  Modeling  of  Electronic  Materials  and  Materials 
in  Extreme  Environments.  ARL’s  diverse  assortment  of  unique  facilities  and  dedicated  workforce  of  government  and  private  sector 
partners  make  up  the  largest  source  of  world  class  integrated  research  and  analysis  in  the  Army. 

ARL  Mission 

The  mission  of  ARL  is  to  provide  innovative  science,  technology,  and  analyses  to  enable  full-spectrum  operations. 


Our  Vision 

America’s  Laboratory  for  the  Army:  Many  Minds,  Many  Capabilities,  Single  Focus  on  the  Soldier. 


ARL’s  Organization 

• Army Research Office (ARO) - Initiates the scientific and far reaching technological discoveries in extramural organizations:
educational institutions, nonprofit organizations, and private industry.

• Computational and Information Sciences Directorate - Scientific research and technology focused on information processing,
network and communication sciences, information assurance, battlespace environments, and advanced computing that
create, exploit, and harvest innovative technologies to enable knowledge superiority for the Warfighter.

• Human Research and Engineering Directorate - Scientific research and technology directed toward optimizing Soldier
performance and Soldier-machine interactions to maximize battlefield effectiveness and to ensure that Soldier performance
requirements are adequately considered in technology development and system design.

• Sensors and Electron Devices Directorate - Scientific research and technology in electro-optic smart sensors, multifunction
radio frequency (RF), autonomous sensing, power and energy, and signature management for reconnaissance, intelligence,
surveillance, target acquisition (RISTA), fire control, guidance, fuzing, survivability, mobility, and lethality.

• Survivability/Lethality Analysis Directorate - Integrated survivability and lethality analysis of Army systems and technologies
across the full spectrum of battlefield threats and environments as well as analysis tools, techniques, and methodologies.

• Vehicle Technology Directorate - Scientific research and technology addressing propulsion, transmission, aeromechanics,
structural engineering, and robotics technologies for both air and ground vehicles.

• Weapons and Materials Research Directorate - Scientific research and technology in the areas of weapons, protection, and
materials to enhance the lethality and survivability of the Nation’s ground forces.


ARL  Workforce  in  2013 

•  1,980  Civilians  -  38  Military 

• 1,080 Contractors (1,027 full-time/53 part-time)

• 1,379 Research Performing Workforce

•  552  (40%)  hold  PhDs 

•  11  STs/23  ARL  Fellows 

ARL’s  Primary  Sites 

•  Aberdeen  Proving  Ground,  MD 

•  Adelphi  Laboratory  Center,  MD 

•  White  Sands  Missile  Range,  NM 

•  Raleigh-Durham,  NC 

•  Orlando,  FL 

Visit  ARL’s  web  site  at  www.arl.army.mil 


Unique  ARL  laboratory  facilities  and  modeling  capabilities  provide  our  scientists  and 
engineers  with  a  world-class  research  environment. 

FOREWORD


Thank  you  for  your  interest  in  this  latest  edition  of  Research@ARL.  This 
compendium  of  previously  published  peer-reviewed  journal  articles 
represents  the  best  of  ARL  research  efforts  in  the  topic  area  covered.  Our 
researchers  take  extreme  pride  in  the  quality  of  their  research  that  will 
influence  the  way  the  U.S.  Army  operates  10,  20  and  even  30  years  from 
now.  Their  dedication  to  our  mission  and  the  mission  of  the  U.S.  Army  is 
focused  on  providing  enhanced  capabilities  for  our  Soldiers  of  the  future.  In 
this  edition  of  Research@ARL,  we  take  a  look  at  the  science  and  technology 
of  imaging  and  image  processing. 

For  over  400  years,  optical  instruments  have  been  essential  to  military 
operations.  From  the  first  telescopes  and  binoculars,  imaging  has 
expanded  to  put  remote  cameras  on  satellites  and  on  autonomous 
platforms.  Significantly,  the  U.S.  Army  Research  Laboratory’s  predecessor 
organizations  developed  technologies  in  the  1950s  and  1960s  that  allow 
us  to  see  in  the  dark,  giving  the  U.S.  military  a  strategic  warfighting  advantage.  In  its  2012  Optics  and  Photonics: 
Essential  Technologies  for  Our  Nation,  the  National  Academies  highlighted  the  importance  of  imaging  and  optical 
technologies  to  the  military. 

This  volume  of  Research@ARL  highlights  recent  contributions  to  imaging  science  being  made  by  ARL  researchers. 
These  include  advanced  concepts  in  optical  design,  infrared  technology,  and  image  processing.  Further,  using 
the  power  of  signal  processing,  ARL  researchers  are  altering  the  concept  of  imaging  itself  to  develop  capabilities 
not  previously  possible.  For  example,  by  exploiting  the  quantum  nature  of  light,  ARL  researchers  are  developing 
imagers  that  will  allow  cameras  to  see  around  corners. 

I  hope  you  will  enjoy  perusing  the  articles  contained  in  this  volume,  and  after  doing  so,  I’m  sure  you’ll  appreciate 
the  advances  our  scientists  and  engineers  are  making  to  ensure  our  Army  remains  technologically  superior. 


Dr.  Thomas  P.  Russell 

Director,  U.S.  Army  Research  Laboratory 

Imaging and Image Processing Research at ARL
J.  N.  Mait,  G.  Videen,  N.  M.  Nasrabadi,  and  K.-K.  Choi 


1.  Introduction 

Today  cameras  are  ubiquitous.  In  addition  to  capturing  special  moments  with  family  and  friends,  they  monitor  traffic  and  they 
monitor  us  in  public  places.  They  provide  a  full  visual  field  as  we  back  up  our  cars.  We  can  even  swallow  a  pill-sized  camera 
to image our intestinal tract. Cameras are so widely used that 48 hours of video data are uploaded to YouTube every minute.¹
Through  advancements  in  optics  and  photodetectors,  cameras  are  now  commodities.  Further,  the  proximity  of  cameras  to 
processing  chips  in  smartphone  platforms  is  driving  an  explosion  in  imaging  applications. 

The  U.S.  military  has  been  a  significant  but  unheralded  contributor  to  this  revolution  through  its  development  of  lightweight, 
small-scale  optics,  high-pixel-count  focal-plane  arrays,  and  algorithms  for  pattern  recognition.  Further,  nearly  50  years  ago, 
predecessor  organizations  to  the  U.S.  Army  Research  Laboratory  (ARL)  developed  image  intensifiers  and  infrared  imaging 
systems to “own the night.” These technologies were first deployed in small numbers to soldiers in Vietnam, and their prevalence in
military units during Desert Storm gave the U.S. a strategic advantage.

Today,  ARL  researchers  are  building  on  this  heritage  to  alter  the  concept  of  imaging  itself.  For  many  years,  one  created  a 
camera  by  combining  optics  to  form  an  image  with  a  detector  to  capture  it;  more  recently,  one  uses  post-detection  processing 
to enhance it. When the camera is viewed as a whole, however, it is possible to spread the process of image formation across all three
elements: optics, detectors, and signal processing. Doing so has allowed ARL researchers to develop imaging capabilities that
are  not  possible  under  the  old  paradigm. 

This  volume  presents  some  of  those  capabilities,  as  well  as  other  contributions  made  by  ARL  researchers  to  imaging  and  image 
processing,  and  this  introduction  provides  context  for  ARL’s  research  investment  in  these  areas. 

2.  Army  Applications 

Military  applications  of  imaging  are  varied.  They  include,  for  example,  intelligence  gathering  (collecting  specific  information 
to  support  a  query),  surveillance  (watching  a  particular  region  for  activity),  reconnaissance  (gathering  task-specific  military 
information),  targeting  (labeling  an  object  unequivocally  as  an  object  for  destruction),  and  battle-damage  assessment  (BDA, 
assessing the status of a target after an engagement). Broadly, they serve national policy, strategic policy, and tactical missions.
At  the  national  level,  imaging  assets  are  used  to  assess  adherence  to  international  treaties.  Critical  strategic  applications 
include  intelligence,  reconnaissance,  and  surveillance  (ISR),  and  critical  tactical  applications  include  targeting  and  BDA. 
Further,  as  imaging  assets  have  become  available  to  field  commanders,  situational  awareness  (a  tactical  understanding  of 
one’s  surroundings)  also  has  become  a  critical  application. 

The  applications  of  imaging  to  military  engagements  were  evident  at  the  beginning  of  imaging  science.  Although  the  microscope, 
invented  in  1590,  is  recognized  as  the  first  optical  instrument,  speculation  exists  that  the  British  defeat  of  the  Spanish  Armada 
in 1588 was enabled in part by the secret invention of the telescope.² If true, the lesson of the Spanish defeat is that seeing
farther  than  an  adversary  gives  one  the  advantage  of  time.  If  untrue,  the  story  is  at  least  apocryphal.  Two  months  after 
submitting  his  patent  application  for  the  telescope  in  1608,  the  recognized  inventor,  Hans  Lippershey,  submitted  another  for 
binoculars.  To  this  day,  binoculars  remain  a  mainstay  of  tactical  military  units. 

The invention of film in 1837 removed the need for a human observer and led to the development of the camera. Consequently,
the  first  aerial  photograph  was  taken  in  1858  by  a  photographer  in  a  hot  air  balloon,  and  a  camera  launched  on  a  kite  in 
1882  initiated  the  field  of  remote  imaging.  The  practice  of  loading  cameras  with  film  canisters,  placing  them  on  aerial  and 
high-altitude  platforms,  and  retrieving  the  canisters  after  exposure  was  used  to  great  effect  in  World  War  I  and  continued  into 
the  Cold  War  with  U-2  reconnaissance  aircraft  and  Corona,  the  first  imaging  satellite.  In  1962,  these  platforms  provided  the 
imagery  that  indicated  the  Soviets  were  constructing  missile  launchers  in  Cuba. 

The  invention  of  solid-state  detection  by  Smith  and  Boyle  in  1969  removed  the  need  to  retrieve  film  canisters  and,  in  1976,  the 
National  Reconnaissance  Office  launched  the  KH-11,  the  first  satellite  equipped  with  electronic  imaging.  Such  capabilities  were 
used in the 1980s to “trust but verify” arms treaties with the Soviets. (The first personal digital camera was invented in 1975
by  Steve  Sasson  of  Kodak.) 

In  1991,  satellite  imagery  was  used  to  create  target  lists  for  Desert  Storm.  However,  targeting  mobile  SCUD  missile  launchers 
revealed  the  need  to  shorten  the  latency  between  target  detection  and  engagement.  Delays  encountered  in  collecting  and 
analyzing  satellite  imagery  and  distributing  the  results  back  to  field  commanders  drove  the  need  for  persistent  surveillance. 

In  a  reflection  of  the  past,  cameras  attached  to  balloons  hovering  over  cities  have  returned.  The  aerostat-borne  Persistent 
Threat  Detection  System  and  its  aircraft-borne  brethren,  like  Constant  Hawk,  have  reduced  collection  and  distribution  times.  But 
analysis  remains  a  critical  problem.  These  systems  generate  massive  amounts  of  digital  image  data  that  require  considerable 
time  to  process  and  interpret. 

The scale reduction in imaging systems has been a boon for tactical units. Small unmanned aerial vehicles are being developed
to provide overhead imaging to company commanders, and camera-equipped robots are providing squads with situational
awareness of buildings, caves, and other unexplored terrain before a Soldier enters. No Soldier should ever again have to
experience the fear of a “tunnel rat” in Vietnam, armed only with a flashlight and a .45.

The  imaging  applications  discussed  up  to  this  point  require  a  camera  that  merely  replicates  what  humans  could  see  if  their 
eyes  were  in  the  position  of  the  camera.  However,  a  camera  does  not  function  like  a  human  eye,  which  is  both  a  disadvantage 
and  an  advantage.  For  example,  the  dynamic  range  of  the  human  eye  is  far  greater  than  that  of  a  camera.  Humans  can  see 
in bright sun and on moonless nights, whereas most cameras do not image well, if at all, in dim light. If they do, their images
are  saturated  if  just  a  little  light  is  present.  On  the  other  hand,  our  eyes  detect  radiation  only  in  the  visible  portion  of  the 
electromagnetic  spectrum.  Yet  we  can  build  cameras  that  detect  radiation  in  different  wavebands,  such  as  infrared.  Infrared 
cameras  are  useful  for  seeing  warm  objects  at  night,  such  as  humans  and  car  engines. 

In  the  next  section,  we  highlight  contributions  made  by  ARL  researchers  to  address  the  technical  challenges  presented  by 
military  applications  of  imaging. 

3.  ARL  Research 

3.1.  Optics  and  Optical  Design 

The  most  fundamental  principle  underlying  all  imaging  systems  is  that  the  angular  size  of  the  smallest  object  the  system  can 
resolve  is  proportional  to  the  wavelength  of  the  illumination  and  inversely  proportional  to  the  diameter  of  the  input  aperture.  To 
increase  resolution,  one  increases  the  diameter  of  the  optical  system.  However,  this  also  increases  system  volume.  A  system 
capable  of  resolving  objects  half  the  size  of  another  system  nominally  requires  eight  times  the  volume.  Further,  the  field-of-view, 
or  how  much  of  a  scene  an  observer  can  see,  is  a  function  of  the  extent  of  the  image  plane.  An  imager  with  a  detector  that  is 
twice  as  large  as  another  imager  will  have  twice  the  field-of-view,  but  it  will  also  have  four  times  the  volume.  Thus,  designers 
must  balance  optical  performance  and  system  size. 
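
The scaling argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes the Rayleigh criterion for a circular aperture (the factor 1.22, which the text does not state) and assumes that every linear dimension of the imager grows in proportion to the aperture diameter, so that volume scales as the cube of the diameter. The function names and the example wavelength and apertures are hypothetical.

```python
def angular_resolution(wavelength_m, aperture_m, k=1.22):
    """Smallest resolvable angle in radians: k * wavelength / aperture diameter.

    k = 1.22 is the Rayleigh factor for a circular aperture (an assumption here).
    """
    return k * wavelength_m / aperture_m


def volume_ratio(diameter_ratio):
    """Relative volume, assuming all linear dimensions scale with the diameter."""
    return diameter_ratio ** 3


# Hypothetical example: 0.5 um visible light, 5 cm versus 10 cm aperture.
small = angular_resolution(0.5e-6, 0.05)
large = angular_resolution(0.5e-6, 0.10)

print(large / small)      # doubling the aperture halves the resolvable angle
print(volume_ratio(2))    # and, under the cubic assumption, costs 8x the volume
```

Under these assumptions, halving the size of the smallest resolvable object requires doubling the aperture, which is where the nominal factor of eight in volume comes from.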

For  persistent  surveillance,  one  would  like  to  maintain  resolution  across  a  large  field-of-view.  However,  if  one  does  not  change 
the  optics  as  the  field-of-view  increases,  the  quality  of  the  image  at  its  edges  degrades.  Light  arriving  at  a  detector  edge  travels 
a  longer  path  than  light  that  arrives  at  the  detector  center.  For  large  fields-of-view,  the  additional  path  length  produces  a 
distorted,  or  aberrated,  image. 

Milojkovic and Mait investigated the trade-off between optical performance and physical size for imagers with a large
field-of-view in “Space-bandwidth scaling for wide field-of-view imaging” (page 13). They considered two different types of lenses,
a conventional plano-convex lens and a monocentric lens (i.e., one in which the front and back surfaces of the lens have a
common center), combined with two different types of detectors, a conventional flat detector array and a curved detector array.
For  all  cases,  they  quantified  optical  performance  and  physical  characteristics,  such  as  size  and  weight,  as  they  varied  the  lens 
diameter  from  a  few  micrometers  to  a  few  meters.  Their  analysis  indicates  that  a  monocentric  lens  imaging  onto  a  curved 
detector  outperforms  other  systems  for  the  same  requirements  on  optical  performance.  Unfortunately,  a  monocentric  lens 
requires  more  glass  than  a  plano-convex  lens  with  the  same  focal  length  and,  therefore,  weighs  considerably  more.  Milojkovic 
and  Mait  quantified  the  trade-off  between  weight  and  optical  performance,  and  they  also  determined  the  minimum  volume  an 
imager  must  have  to  achieve  a  desired  optical  performance.  Their  results  allow  optical  designers  to  balance  resources  against 
performance  when  designing  imaging  systems  for  persistent  surveillance. 

The  relationship  between  aperture  diameter  and  wavelength  holds  at  all  wavelengths.  For  the  same  aperture  diameter,  a  longer 
wavelength  implies  worse  resolution.  This  is  the  case  for  infrared  (IR)  radiation,  whose  wavelength  is  approximately  10  times 
longer than that of visible radiation. However, IR radiation conveys different information about a scene than visible radiation does, which
can  be  advantageous. 

In  the  visible  spectrum,  people  and  objects  reveal  their  physical  characteristics  only  when  illuminated  by  a  source,  such  as  the 
sun  or  indoor  lighting.  Measurements  made  in  the  IR  spectrum  reveal  the  temperature  of  an  object.  That  is,  a  vehicle  imaged 
using  an  IR  system  will  look  dark  if  it  has  not  been  used  for  hours  and  bright  if  its  engine  has  just  been  turned  off.  Similarly,  a 
patch  of  soil  produces  a  different  IR  image  when  it  is  in  direct  sunlight  versus  when  it  is  in  shade. 

In “Remote detection of buried land-mines and IEDs using LWIR polarimetric imaging” (page 27), Gurton and Felton exploit the
properties  of  IR  imaging  to  distinguish  between  disturbed  and  undisturbed  soil,  which  can  potentially  improve  the  detection 
of  buried  explosives.  Gurton  and  Felton  demonstrated  experimentally  that  IR  imaging  can  reveal  the  differences  in  physical 
structure  between  disturbed  and  undisturbed  soil. 

Instead  of  measuring  just  the  intensity  of  IR  radiation  reflected  off  a  patch  of  soil,  Gurton  and  Felton  also  measured  the 
polarization  of  the  radiation.  Polarization  is  a  property  of  an  optical  field  that  indicates  the  orientation  of  its  oscillations. 
Sunlight,  for  example,  has  no  preferred  polarization.  But  sunlight  reflected  off  a  surface  does. 

Through a series of field experiments, Gurton and Felton showed that IR radiation from disturbed soil is more polarized than
that from undisturbed soil. They also identified two sources of the polarization. The IR radiation given off by warm, disturbed soil is
strongly polarized. Additionally, upon reflection, the disturbed soil polarizes the sun’s IR radiation. Further, Gurton and Felton
showed that the difference in the degree of polarization between disturbed and undisturbed soil is sufficiently large that one
can reliably distinguish between the two. When combined with other sensor modalities, such as radar, IR polarimetric imaging
could enhance our ability to detect landmines, IEDs, and other buried threats.
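
The text does not say how the degree of polarization was computed, so the following is only a sketch of one common convention, not the authors' processing. It assumes intensities measured through a linear polarizer at 0, 45, 90, and 135 degrees, forms the linear Stokes parameters from them, and reports the degree of linear polarization. The function name and the sample intensities are hypothetical.

```python
import math


def degree_of_linear_polarization(i0, i45, i90, i135):
    """Degree of linear polarization from four polarizer-angle intensities.

    Assumed convention: S0 = I0 + I90, S1 = I0 - I90, S2 = I45 - I135,
    DoLP = sqrt(S1^2 + S2^2) / S0.
    """
    s0 = i0 + i90
    s1 = i0 - i90
    s2 = i45 - i135
    if s0 == 0:
        return 0.0  # no signal, so no meaningful polarization
    return math.hypot(s1, s2) / s0


# Hypothetical pixel values in arbitrary units, not measured data.
undisturbed = degree_of_linear_polarization(100.0, 100.0, 100.0, 100.0)
disturbed = degree_of_linear_polarization(104.0, 100.0, 96.0, 100.0)

print(undisturbed, disturbed)
```

In this made-up example the unpolarized pixel gives zero and the second pixel gives a few percent; a detection scheme of the kind described would threshold on such a difference.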

At even longer wavelengths, beyond IR, radiation phenomenology changes again. The region between optical and radio
frequencies is the terahertz (THz) and gigahertz (GHz) regime, where wavelengths range between 300 µm and 3 mm. In
this frequency band, it is possible to image through obscurants such as dust, smoke, fog, and even textiles. Thus, millimeter
wave (mmW) imaging, as it is sometimes called, offers potential solutions for helicopter pilots operating in degraded visual
environments and for security personnel monitoring people for body-borne explosives at checkpoints or urban sites.

For the latter application, as in any persistent surveillance application, one desires a large field-of-view. However, with
wavelengths four orders of magnitude longer than visible, this is not physically possible. Thus, a large field-of-view is achieved
by scanning the imager across a scene. Presently, scanning is achieved mechanically. However, a more efficient method
is to alter the radiation phase within the aperture of the imager so that its limited field-of-view is scanned across a scene
electronically. In “Design of 220 GHz electronically scanned reflectarrays for confocal imaging systems” (page 45), Hedden,
Dietlein, and Wikner considered the performance of an integrated reflectarray to provide electronic scanning. They examined
the tradeoffs between reflectarray size, system size, and the number of resolvable image pixels. Consequently, they designed
an imager that operates at 220 GHz (λ = 1.36 mm) with 8.3 cm resolution at 50 m. This resolution allows one to discern the
barrel of a pistol or blade of a short knife at 50 m. The system aperture is 1 m and system length is 0.23 m, which provides the
requisite resolution, a 30-degree full field-of-view, and still allows the system to be portable. The reflectarray is 5.4 cm × 5.4
cm, with 78 × 78 phase shifting elements spaced a half-wavelength apart.
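
The figures quoted for this design can be sanity-checked from first principles. The sketch below is a rough consistency check, not the authors' design procedure. It assumes the Rayleigh factor of 1.22 for a circular aperture, which the text does not state, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength(frequency_hz):
    """Free-space wavelength in meters."""
    return C / frequency_hz


def spot_size(wavelength_m, aperture_m, range_m, k=1.22):
    """Diffraction-limited resolvable size at a given range (Rayleigh factor assumed)."""
    return k * wavelength_m / aperture_m * range_m


lam = wavelength(220e9)              # operating frequency from the text
res = spot_size(lam, 1.0, 50.0)      # 1 m aperture, 50 m range
array_side = 78 * lam / 2            # 78 elements at half-wavelength spacing

print(f"wavelength: {lam * 1e3:.2f} mm")           # prints "wavelength: 1.36 mm"
print(f"resolution at 50 m: {res * 1e2:.1f} cm")   # prints "resolution at 50 m: 8.3 cm"
print(f"array side: {array_side * 1e2:.1f} cm")    # prints "array side: 5.3 cm"
```

The wavelength and resolution reproduce the quoted 1.36 mm and 8.3 cm. The array side comes out near 5.3 cm, close to the quoted 5.4 cm; the small difference presumably reflects rounding or edge margins that this estimate ignores.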

Hedden, Dietlein, and Wikner also characterized the impact on the quality of an imaged point if one uses a 1-bit reflectarray
(i.e., one that can realize only two phase shifts, 0 and π) in comparison to a 2-bit reflectarray, one that can realize four phase
shifts, 0, π/2, π, and 3π/2. In simulations, the 1-bit reflectarray generated significantly more noise and errors than the
2-bit  reflectarray  and  did  not  meet  the  requirements  for  imaging.  Their  design  and  analysis  provide  circuit  designers  with 
requirements  for  reflectarray  performance  and  systems  designers  with  an  architecture  for  future  systems  to  detect  body-borne 
devices  in  cluttered  urban  environments. 
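
To give a feel for why the number of phase bits matters, the sketch below uses a textbook scalar idealization rather than the authors' simulation. It assumes the ideal phase required at each element is uniformly distributed over a full cycle, rounds each phase to the nearest level the reflectarray can realize, and estimates the fraction of power that still adds coherently in the focused spot. All names are hypothetical.

```python
import cmath
import math


def quantize_phase(phase, bits):
    """Round a phase (radians) to the nearest of 2**bits equally spaced levels."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    return (round(phase / step) % levels) * step


def focused_power_fraction(bits, samples=3600):
    """Fraction of power left in the focused spot after phase quantization.

    Assumes ideal phases spread uniformly over [0, 2*pi); the residual phase
    errors reduce the coherent sum of the element contributions.
    """
    total = 0j
    for k in range(samples):
        ideal = 2 * math.pi * k / samples
        error = quantize_phase(ideal, bits) - ideal
        total += cmath.exp(1j * error)
    return abs(total / samples) ** 2


one_bit = focused_power_fraction(1)
two_bit = focused_power_fraction(2)
print(f"1-bit: {one_bit:.2f}, 2-bit: {two_bit:.2f}")
```

Under this idealization, roughly 41% of the power remains in the focused spot with two phase levels and roughly 81% with four. The power that is lost reappears as sidelobes and background, which is consistent in direction with the higher noise reported for the 1-bit design, although the paper's own figures of merit may differ.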

3.2.  Detection 

Given  the  large  commercial  market,  development  of  detectors  for  visible  imaging  is  primarily  the  domain  of  industry.  However, 
the  military  continues  to  dominate  the  development  of  IR  detectors.  By  observing  the  natural  radiation  given  off  by  an  object, 
the military can detect and track targets at a great distance without relying on a light source. For the same reason, IR imaging is
also a more reliable way to achieve situational awareness. The many uses of IR imaging include, for example, night vision,
large  area  surveillance  and  reconnaissance,  helicopter  piloting  in  degraded  visual  environments,  detecting  and  countering 
unmanned  aerial  systems,  gun-sights  for  armored  vehicles  and  dismounted  soldiers,  covert  search  and  rescue,  and  decoy 
countermeasures. 

To  ensure  military  dominance  on  the  battlefield,  the  Army  seeks  to  maintain  its  superiority  in  infrared  detection.  To  achieve  this 
goal,  the  focal  plane  arrays  (FPAs)  the  Army  deploys  must  excel  in  all  areas  of  performance  including  thermal  sensitivity,  image 
resolution, speed of detection, pixel uniformity and operability, system reliability and robustness, operational readiness, and
simplicity.  To  enable  large-scale  deployment,  the  technology  must  also  be  manufacturable  and  affordable. 

As  the  corporate  research  laboratory  of  the  Army,  ARL  is  exploring  new  frontier  science  and  technology  to  revolutionize  the 
Army’s  IR  capabilities.  To  ensure  no  opportunities  are  overlooked,  the  Army  engages  in  a  wide  range  of  research,  from  the 
most  conventional  to  the  most  exotic  infrared  materials.  Currently,  mercury  cadmium  telluride  (HgCdTe)  is  the  most  sensitive  IR 
material  among  competing  materials.  However,  HgCdTe  detector  arrays  are  difficult  to  produce  because  the  substrates  needed 
to  grow  the  material  are  not  suitably  large.  HgCdTe  is  traditionally  grown  on  bulk  cadmium  zinc  telluride  (CZT)  substrates,  which 
are  lattice-matched  to  HgCdTe.  However,  CZT  substrates  are  available  only  in  relatively  small  sizes.  Further,  the  difference  in 
thermal  expansion  coefficients  between  a  CZT  substrate  and  its  silicon  (Si)  read-out  integrated  circuitry  reduces  the  reliability 
of  large  format  FPAs  due  to  repeated  thermal  cycling. 

Some  in  the  community  believed  this  problem  could  be  overcome  by  growing  HgCdTe  on  composite  substrates  consisting  of 
cadmium  selenide  and  telluride,  and  silicon  (Cd(Se)Te/Si).  They  also  believed  this  approach  could  potentially  provide  a  route  to 
affordable, robust third-generation FPAs. However, due to the lattice mismatch between Cd(Se)Te and Si, this approach leads
to high dislocation densities (greater than mid-10⁶ cm⁻²), which degrade performance.

Alternatively,  one  can  change  the  IR  material,  and  researchers  at  ARL  have  considered  mercury  cadmium  selenide  (HgCdSe) 
as an alternative to HgCdTe. With a lower variance in lattice constant, HgCdSe yields better lattice-matching to alternate
substrates  and  offers  better  multi-color  performance. 

Additionally,  HgCdSe  offers  flexibility  in  the  choice  of  substrate.  With  a  lattice  constant  near  0.61  nm,  HgCdSe  is  well-suited 
to  be  grown  on  either  gallium  antimonide  (GaSb)  or  zinc  telluride  (ZnTe)  substrates.  GaSb  is  available  as  a  bulk  substrate  with 
dislocation densities ~10⁴ cm⁻², and ZnTe can be alloyed with zinc selenide (ZnSe) to form lattice-matched ZnTe₁₋ₓSeₓ. There
have  been  attempts  to  produce  good  quality  material  in  the  past,  but  ARL  has  produced  the  highest  quality  material  reported. 

The  paper  “Mercury  cadmium  selenide  for  infrared  detection”  by  Doyle  et  al.  (page  57)  represents  a  significant  step  in  reducing 
the  background  electron  concentration  of  HgCdSe  to  a  level  suitable  for  fabricating  devices.  As  described  in  the  paper,  by 
switching  to  high  purity  6N  Se  source  material,  Doyle  and  his  co-authors  reduced  the  background  electron  concentration  of 
HgCdSe samples by over an order of magnitude, from concentrations greater than 10¹⁷ cm⁻³ to the mid-10¹⁶ cm⁻³. They reduced
this  further  by  annealing  under  Se.  However,  before  HgCdSe  devices  are  developed,  this  background  concentration  must  be 
reduced  by  at  least  another  order  of  magnitude.  Current  work  is  moving  towards  reducing  the  carrier  concentration  to  develop 
IR  detectors. 

For  less  conventional  IR  technologies,  quantum  well  infrared  photodetector  (QWIP)  technology  is  a  promising  candidate  except 
for  one  major  weakness.  Made  from  one  of  the  highest  quality  semiconductor  materials  besides  silicon,  gallium  arsenide  (GaAs) 
QWIPs  could  easily  fulfill  all  the  Army’s  IR  requirements.  Unfortunately,  this  material  requires  an  unusual  detection  scheme.  To 
produce an electrical signal, incident light must travel sideways, i.e., parallel to the material layers. When IR light shines on
the  detector  surface,  as  in  most  other  detector  technologies,  no  light  is  detected.  The  standard  solution  to  this  problem  is  to 
place  a  diffraction  grating  on  top  of  individual  detectors  to  disperse  incoming  light  into  different  angles,  thereby  altering  the 
propagation  direction  of  the  light.  A  portion  of  the  light  travels  at  a  large  angle  and  is  detected.  However,  this  approach  has  a 
quantum  efficiency  (QE)  of  only  5%.  With  95%  of  the  incident  light  undetected,  QWIPs  cannot  provide  the  military  with  the  IR 
sensitivity  and  imager  speed  it  needs.  Presently,  25  years  after  its  invention,  QWIP  technology  has  not  improved  significantly, 
and  it  has  long  been  deemed  to  be  a  low  QE  technology. 

Nonetheless,  ARL  is  determined  to  increase  the  QE  of  a  QWIP  and  recent  work  shows  great  promise  to  increase  it  significantly. 
ARL’s  approach  includes  developing  a  highly  accurate  electromagnetic  (EM)  model  to  calculate  the  EM  field  inside  a  complex 
detector  geometry,  a  capability  that  the  infrared  community  heretofore  did  not  recognize  or  pursue.  After  considerable  effort,  as 
reported  in  “Electromagnetic  Modeling  and  Design  of  Quantum  Well  Infrared  Photodetectors,”  Choi  et  al.  (page  63)  succeeded 
in  establishing  a  finite-element  EM  model  that  one  can  apply  to  any  arbitrary  three-dimensional  detector  geometry.  This  model 
enabled  the  researchers  to  explain,  for  the  first  time,  all  the  previously  unexplained  open  literature  experimental  data,  and  for 
the  first  time  to  predict  a  definitive  QE  from  any  QWIP  design.  Aided  by  this  EM  model,  the  authors  advanced  a  new  detector 
concept,  referred  to  as  the  resonator-QWIP  or  R-QWIP.  The  R-QWIP  utilizes  the  detector  volume  as  a  resonant  cavity  to  diffract, 
capture,  and  store  the  otherwise  unabsorbed  light  until  it  is  eventually  absorbed  by  the  detector.  Choi  predicted  a  QE  for 
the  R-QWIP  as  large  as  75%.  Subsequently,  the  authors  tested  the  R-QWIP  concept  on  five  different  detector  materials  and 
observed  QEs  ranging  from  15  to  71%.  One  of  the  more  modest  QE  materials  was  fabricated  into  imaging  arrays  and  yielded 
a QE of 30%, all in accordance with the model predictions. Even with this modest QE, the thermal sensitivity (i.e., the detector’s
lowest measurable temperature change) is already 15 mK at a 2.4 ms integration time. These metrics are
many times better than the 20 mK at a 20 ms integration time of standard QWIP cameras, proving the potential of the new detector
concept.  Imaging  arrays  with  higher  QEs  are  under  production. 
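
The two operating points quoted above differ in both sensitivity and integration time, so a rough common-footing comparison may be useful. The sketch below assumes (this is not stated in the source) that thermal sensitivity improves as the square root of integration time, as it would for a shot-noise-limited detector. The function name is hypothetical.

```python
import math

def rescale_sensitivity(sensitivity_mk, t_from_ms, t_to_ms):
    """Rescale a thermal sensitivity (in mK) to another integration time.

    Assumption: sensitivity is proportional to 1 / sqrt(integration time).
    """
    return sensitivity_mk * math.sqrt(t_from_ms / t_to_ms)

# Standard QWIP camera: 20 mK at 20 ms, rescaled to the 2.4 ms used for the R-QWIP.
standard_at_2p4 = rescale_sensitivity(20.0, 20.0, 2.4)
advantage = standard_at_2p4 / 15.0   # R-QWIP array: 15 mK at 2.4 ms

print(f"{standard_at_2p4:.1f} mK, {advantage:.1f}x")   # prints: 57.7 mK, 3.8x
```

Under that scaling assumption, the R-QWIP array would be roughly four times more sensitive than a standard QWIP camera at equal integration time.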

Additionally,  the  developed  EM  model  allows  designers  to  have  much  greater  control  of  the  detector’s  optical  properties.  In 
the  near  future,  different  R-QWIPs  will  be  produced  to  suit  a  wide  range  of  applications— for  example,  narrow  band  imaging 
through  dust  clouds,  narrow  band  detection  of  chemical  gases,  simultaneous  two-color  detection  for  infrared  search  and  track, 
sequential  two-color  detection  for  target  detection  and  identification,  broadband  detection  for  hyperspectral  imaging,  and 
circular  polarization  detection  for  biological  imaging. 

In  addition  to  advancing  QWIP  technology,  using  the  resonant  storage  of  light  to  enhance  absorption  is  also  applicable  to  other 
infrared  materials  and  other  optical  devices,  such  as  solar  cells.  The  paper  by  Choi  et  al.  shows  how  ARL’s  basic  and  applied 
research  can  make  an  ineffective  infrared  technology  useful  to  the  Army  and  how  this  research  can  have  an  even  broader 
impact  on  other  scientific  and  technological  areas. 

To  further  strengthen  the  Army’s  IR  capability,  ARL  also  works  on  IR  detection  by  altering  and  manipulating  a  material’s  optical 
properties  in  this  wavelength  regime.  The  paper  “Passive  infrared  sensing  using  plasmonic  resonant  dust  particles”  by  Mirotznik, 
et  al.  (page  75)  reported  a  new  way  to  control  the  IR  emission  spectrum  from  surfaces  and  particles.  Most  objects,  either 
manmade  or  found  in  nature,  reflect  and  emit  IR  radiation  in  a  relatively  smooth  and  broad  spectrum;  however,  by  applying 
structures  with  resonant  absorption  to  the  surface  of  those  materials,  the  reflection  and  emission  spectra  can  be  enhanced 
or reduced at particular wavelengths. Moreover, by mixing small resonant particles (<100 µm) designed for several different
wavelengths, one can form an IR dust that reflects or emits with a characteristic spectral signature. Such material-by-design
particles would be useful for a variety of practical applications. For example, when applied to a base surface, the resonant
particles  could  be  used  to  tune  IR  reflectance  to  mimic  other  natural  or  manmade  surfaces. 

The  resonant  particles  and  surface  treatments  are  of  particular  interest  for  the  Army.  Potential  applications  include  atmospheric 
sensing  of  chemical  agents,  calibration  and  training  aids  for  hyperspectral  imaging  systems,  and  creation  of  custom  infrared 
spectral  signatures  for  passive  friend-or-foe  identification.  For  example,  in  the  training-aid  application,  Soldiers  could  learn  to 
use  hyperspectral  equipment  to  identify  the  spectral  signature  of  dangerous  chemical  agents  by  observing  an  assortment  of 
safe  plasmonic  samples  that  are  designed  to  match  the  spectral  signatures  of  the  dangerous  agents. 

The  paper  presents  computational  and  experimental  results  for  particles  that  can  be  tuned  to  preferentially  reflect  or  emit  IR 
radiation within the 8-14 µm infrared band. The particles consist of thin metallic subwavelength gratings patterned on the
surface  of  a  simple  quarter  wavelength  cavity.  This  design  creates  distinct  IR  absorption  resonances  by  combining  the  plasmonic 
resonance  of  the  grating  with  the  natural  resonance  of  the  cavity.  The  resonance  peaks  are  easily  tuned  by  varying  either  the 
geometry  of  the  grating  or  the  thickness  of  the  cavity.  Measurements  of  reflection  and  emission  from  fabricated  particles  agreed 
with  predicted  performance.  The  tested  particles  use  a  one-dimensional  grating  that  works  for  one  polarization  of  incident  light, 
but the paper also shows that a two-dimensional “fish net” grating should yield high-contrast spectral features for both incident
polarizations. The authors’ next step is to design and fabricate surfaces or particles for practical Army applications so that the
new  structures  can  be  demonstrated  and  tested  under  field  conditions. 

3.3.  Post-Detection  Processing 

ARL  researchers  have  been  at  the  forefront  of  research  in  image  processing  and  scene  analysis.  ARL  actively  conducts  research 
on a large number of topics, such as automatic target recognition, multimodal sensor fusion, personnel detection, super-resolution,
face recognition, object tracking from video sequences, and the use of biometrics for human identification. The
following discussion presents novel image processing techniques for four different applications: super-resolution for video
face recognition, multimodal sensor fusion, automatic target recognition in FLIR imagery, and target detection in hyperspectral
imagery.

Video  imagery  has  become  the  most  common  and  versatile  form  of  media  for  capturing,  analyzing,  and  disseminating  a  variety 
of information. Video surveillance systems are now in widespread use on commercial properties and for residential
monitoring.  One  major  application  of  cheap,  low-resolution  video  surveillance  cameras  is  face  recognition,  which  is  crucial  to 
aiding  the  law  enforcement  community  and  homeland  security  in  identifying  suspects  and  suspicious  individuals  on  watch  lists. 
However, face recognition performance is severely affected by the low resolution of individuals in typical surveillance footage,
often  due  to  the  long  distance  between  individuals  and  cameras,  as  well  as  the  small  pixel  count  of  low-cost  surveillance 
systems.  Fortunately,  super-resolution  algorithms  have  the  potential  to  improve  face  recognition  performance  by  using  a 
sequence  of  low-resolution  images  of  an  individual’s  face  in  the  same  pose  to  reconstruct  a  more  detailed  high-resolution  facial 
image. 

In “Face Recognition Performance with Super-resolution,” (page 85) Hu, Maschal, and Young from ARL, in collaboration
with Hong and Phillips from NIST, developed a super-resolution algorithm for face recognition and conducted an extensive
performance  evaluation  using  a  methodology  and  experimental  setup  consistent  with  real  world  settings,  including  multiple 
subject-to-camera  distances. 

Using the same low-resolution camera, facial images were obtained at far (~13 m), mid (~9 m), and close (~5 m) range. At
these ranges, the face resolutions in terms of eye-to-eye distances were 5-10, 15-20, and 25-30 pixels, respectively. Hu et al.
doubled the effective resolution of the system by digitally processing a sequence of eight low-resolution images. They
then submitted the super-resolved images to a state-of-the-art face recognition algorithm.
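
To illustrate how several low-resolution frames can double the effective resolution, the following is a minimal shift-and-add sketch. It is not the authors' algorithm: it assumes the sub-pixel shift of every frame is already known, that shifts fall exactly on the finer grid, and that there is no blur or noise. All names are hypothetical.

```python
import numpy as np

def shift_and_add(frames, shifts, factor=2):
    """Place each low-resolution frame on a finer grid at its known offset and average."""
    h, w = frames[0].shape
    acc = np.zeros((h * factor, w * factor))
    hits = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, shifts):
        acc[dy::factor, dx::factor] += frame
        hits[dy::factor, dx::factor] += 1
    # Average where samples landed; grid points never sampled stay at zero.
    return acc / np.maximum(hits, 1)

# Synthetic stand-in for a face image and eight shifted low-resolution views of it.
rng = np.random.default_rng(0)
truth = rng.random((16, 16))
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)] * 2
frames = [truth[dy::2, dx::2] for dy, dx in shifts]

restored = shift_and_add(frames, shifts)
```

In practice the shifts must be estimated from the imagery itself, and a deblurring step usually follows; this sketch only shows why complementary sub-pixel offsets carry the extra detail.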

For recognition of faces at 9 m, assuming a fixed 5% false alarm rate (i.e., the system makes a false identification
one out of 20 times), the use of super-resolved images improved the rate of correct identification from 31% using original images to
45%.  Their  results  show  that  super-resolution  image  reconstruction  can  improve  face  recognition  performance  considerably  at 
the  examined  midrange  and  close  range. 

Forward-looking IR (FLIR) cameras provide the U.S. Army with the capability to see through darkness to detect and track objects
of interest, such as humans and vehicles. One of the major military applications of FLIR imaging sensors is automatic target
recognition  (ATR),  which  seeks  to  detect  and  recognize  objects  of  interest  (targets)  in  an  environment  full  of  clutter  and  imaged 
by  an  imperfect  sensor,  which  introduces  noise  into  the  resulting  signal. 

An  ATR  system  consists  of  several  stages.  In  the  first,  the  system  scans  an  entire  image  to  identify  regions  of  interest  within 
which  a  potential  target  is  detected.  In  the  second  stage,  background  clutter  is  removed.  (Roughly,  anything  that  is  not 
considered  part  of  the  target  is  considered  clutter.)  In  the  third  stage,  the  system  computes  a  set  of  features  that  are  used  in 
the fourth stage to classify the target (e.g., bus, sedan, or tank, or even more specifically, an M1 tank versus a T-72 tank).

In “Sparsity-motivated automatic target recognition,” (page 97) Nasrabadi of ARL, in collaboration with Patel and Chellappa
from  the  University  of  Maryland,  developed  a  new  classifier  for  long-wave  IR  imagery.  The  new  classifier  is  based  on  the  concept 
of  using  a  dictionary  of  target  templates  and  the  theories  of  compressive  sensing  (CS)  and  sparse  representation  to  reduce  the 
amount of data required for high confidence classification. Their idea is to create a dictionary matrix of training samples from
all classes of targets and represent the test sample as a sparse linear combination of a few selected dictionary column
vectors. This sparse representation is then used to infer the target type (class) of the input sample. They demonstrated that the
performance  of  the  proposed  classifier  is  significantly  better  than  classical  classifiers,  as  well  as  previously  developed  classifiers 
from  the  ARL. 
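
The dictionary idea described above can be sketched as follows. This is an illustrative sparse-representation classifier, not the authors' implementation: it uses greedy orthogonal matching pursuit as the sparse solver (an assumption; the source does not name a solver), synthetic random "templates" in place of IR target chips, and hypothetical function names.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: approximate y with at most k columns of D."""
    residual, support, coef = y.copy(), [], np.zeros(0)
    for _ in range(k):
        idx = int(np.argmax(np.abs(D.T @ residual)))   # best-matching atom
        if idx in support:
            break
        support.append(idx)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def src_classify(D, labels, y, k=3):
    """Pick the class whose atoms alone reconstruct y with the smallest residual."""
    x = omp(D, y, k)
    best, best_err = None, np.inf
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)            # keep class-c coefficients only
        err = np.linalg.norm(y - D @ x_c)
        if err < best_err:
            best, best_err = c, err
    return best

# Synthetic data: three classes, ten training samples each, 64 features per sample.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 64))
D = np.hstack([centers[c][:, None] + 0.1 * rng.normal(size=(64, 10)) for c in range(3)])
D /= np.linalg.norm(D, axis=0)                         # unit-norm dictionary columns
labels = np.repeat(np.arange(3), 10)

test_sample = centers[1] + 0.1 * rng.normal(size=64)
predicted = src_classify(D, labels, test_sample)
```

The class decision relies only on which dictionary columns receive nonzero weight, which is why a handful of coefficients can suffice for classification.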

They also investigated the use of compressive sensing techniques to reduce the dimensions of both the test samples and the
training samples in the dictionary. They demonstrated that when the dimensionality is reduced from the original 40 x 65 pixel
target templates to only 256 incoherent measurements (features), the decrease in classifier performance is insignificant. When
only 64 features are used, the classifier performance is still 78%; however, when a mere 16 features are used, the classifier
performance degrades drastically to 43%.
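
A minimal sketch of such a dimensionality reduction is given below. It assumes a random Gaussian measurement matrix, which is one common choice for incoherent measurements; the source does not say which matrix was used, and the templates here are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stand-in 40 x 65 target templates (2600 pixels each).
chip_a = rng.random((40, 65))
chip_b = rng.random((40, 65))

# Random Gaussian measurement matrix mapping 2600 pixels to 256 features.
m = 256
phi = rng.normal(size=(m, 40 * 65)) / np.sqrt(m)

feat_a = phi @ chip_a.ravel()
feat_b = phi @ chip_b.ravel()

# Distances between templates are roughly preserved after projection,
# so a classifier can work on the 256 features instead of the raw pixels.
ratio = np.linalg.norm(feat_a - feat_b) / np.linalg.norm(chip_a - chip_b)
```

The same matrix would be applied to every dictionary column and to the test sample, so that classification runs entirely in the reduced space.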

As  mentioned,  IR  imaging  systems  provide  capabilities  that  visible  cameras  cannot,  such  as  seeing  through  darkness,  shadows, 
fog, clouds, rain, snow, and smoke. They are, however, subject to a number of inherent limitations, such as low resolution (in
comparison to visible), the loss of non-thermal but important visual features (such as color and texture), and, under certain
combinations of ambient and target temperatures, low thermal contrast between targets and background. Given that
visible  cameras  are  relatively  low-cost,  easy  to  use,  and  capable  of  producing  high-quality  imagery  under  favorable  conditions, 
researchers  have  considered  exploiting  the  advantages  of  cameras  in  each  spectral  band  to  improve  target  detection. 

In  “Fusing  concurrent  visible  and  infrared  videos  for  improved  tracking  performance,”  (page  107)  ARL  researchers  Chen  and 
Schnelle studied the usefulness of fusing visible color and long wave IR imagery to improve the detection and tracking of
moving  targets.  Although  a  given  sensor  may  be  easily  fooled  sometimes,  it  is  much  harder  to  trick  a  number  of  sensors 
simultaneously  at  any  given  time.  Consequently,  Chen  and  Schnelle  investigated  several  pixel-based  image  fusion  algorithms 
using  image  pyramids  generated  by  the  Laplacian,  contrast,  gradient,  morphological,  and  several  variations  of  the  Discrete 
Wavelet  Transform  (DWT)  methods.  Pixel-based  methods  perform  fusion  at  the  lowest  level  of  image  representation,  the  pixel. 
They  do  not  require  high-level  abstract  information  and  are,  therefore,  the  simplest  to  implement.  Chen  and  Schnelle  performed 
digital  detection  and  tracking  on  the  fused  images,  and  compared  the  performance  across  the  fusion  algorithms  against  the 
performance  of  just  using  visible  imagery  and  just  using  IR  imagery.  Their  results  indicate  that,  in  comparison  to  using  just  IR 
imagery,  detecting  and  tracking  performance  degraded  for  fusion  algorithms  based  on  combining  visible  pixels  with  IR  pixels. 
Performance was mixed for several pyramid-based algorithms, which use physical scale as a basis for representation, and these were
deemed inferior due to their high computational cost. Fusion algorithms based on the DWT provided improved performance
with  the  lowest  computational  costs  of  all  pyramidal  methods.  By  exploiting  the  complementary  strengths  of  visible  and  IR 
imagery,  Chen  and  Schnelle  demonstrated  that  fusion  algorithms  based  on  DWT  image  pyramids  are  capable  of  improving 
target  detection  and  tracking. 
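
As an illustration of pixel-level fusion in a wavelet domain, the sketch below uses a single-level Haar transform. The fusion rule (average the coarse band, keep the larger-magnitude detail coefficient) is a common textbook choice and is an assumption here; the source does not specify the rule or the wavelets that were evaluated.

```python
import numpy as np

def haar2d(img):
    """One level of the 2D Haar transform (both image sides must be even)."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0      # vertical averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0      # vertical differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Exact inverse of haar2d."""
    h, w = ll.shape
    a, d = np.empty((h, 2 * w)), np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    img = np.empty((2 * h, 2 * w))
    img[0::2, :], img[1::2, :] = a + d, a - d
    return img

def fuse(visible, infrared):
    """Average the coarse bands; keep the stronger detail coefficient from either sensor."""
    cv, ci = haar2d(visible), haar2d(infrared)
    ll = (cv[0] + ci[0]) / 2.0
    details = [np.where(np.abs(v) >= np.abs(i), v, i) for v, i in zip(cv[1:], ci[1:])]
    return ihaar2d(ll, *details)

# Toy scene: the visible image holds an edge, the IR image holds one hot pixel.
visible = np.zeros((4, 4)); visible[:, 2:] = 1.0
infrared = np.zeros((4, 4)); infrared[1, 1] = 4.0
fused = fuse(visible, infrared)
```

The fused image keeps the hot pixel from the IR channel as its brightest point while retaining the step seen only in the visible channel.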

Improved target detection based on the fusion of visible and IR imagery leads one naturally to consider the advantages
of  using  multiple  wavebands.  This  is  the  motivation  behind  hyperspectral  imaging.  Hyperspectral  cameras  collect  and 
process  information  across  the  electromagnetic  spectrum  by  dividing  a  region  of  the  spectrum  into  many  narrow  spectral 
bands,  ranging  typically  from  50  to  possibly  400  spectral  bands.  Hyperspectral  sensors  look  at  objects  using  a  vast  portion 
of  the  electromagnetic  spectrum  well  beyond  the  visible  range.  Since  different  materials  have  unique  spectral  signatures, 
hyperspectral  sensors  are  useful  in  agriculture  and  mineralogy  to  distinguish  between  crops  and  soil  types.  However,  one  also 
can  use  spectral  signatures  to  locate  materials  that  pertain  to  a  particular  target.  Thus,  hyperspectral  imagery,  which  combines 
spectroscopy  and  imaging,  is  useful  for  object  detection  and  classification. 

A  byproduct  of  hyperspectral  imaging  is  the  large  amount  of  data  that  results.  How  one  processes  this  data  efficiently  and 
effectively  is  considered  by  Nasrabadi  from  ARL,  in  collaboration  with  Chen  and  Tran  from  Johns  Hopkins  University,  in  “Sparse 
Representation  for  Target  Detection  in  Hyperspectral  Imagery”  (page  121).  Nasrabadi  et  al.  developed  a  novel  sparsity-based 
target  detection  algorithm  to  locate  military  targets  in  hyperspectral  imagery  in  desert  and  forest  environments.  Their  proposed 
algorithm  uses  known  target  and  background  signatures  from  a  spectral  library  to  construct  a  composite  dictionary  consisting  of 
target  and  background  sub-dictionaries.  Then  a  hyperspectral  test  pixel  is  reconstructed  approximately  using  very  few  training 
samples  (i.e.,  sparse  representation)  from  both  target  and  background  sub-dictionaries  after  imposing  a  sparsity  constraint  on 
the  reconstruction.  The  recovered  sparse  representation  is  used  directly  to  detect  the  presence  or  absence  of  a  target  in  the 
hyperspectral  test  pixel.  For  targets  that  consist  of  multiple  pixels,  a  smoothing  constraint  called  the  joint  sparsity  model  is 
enforced  in  the  reconstruction  process  to  incorporate  the  assumption  that  neighboring  spatial  pixels  consist  of  similar  materials. 
By incorporating this contextual information directly into the classifier through the joint sparsity model, it is possible to force
the classifier to make a joint decision on all the neighboring spatial pixels and improve the target detection performance. This
also  avoids  the  usual  post-processing  fusion  of  the  classifier  outputs  on  the  neighboring  pixels. 
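
The single-pixel version of this detector can be sketched as below. This is a simplified illustration, not the authors' code: it omits the joint sparsity model, uses orthogonal matching pursuit as an assumed solver, and replaces library spectra with random synthetic signatures. The detection statistic compares how well the background atoms and the target atoms of the recovered representation each explain the pixel.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: approximate y with at most k columns of D."""
    residual, support, coef = y.copy(), [], np.zeros(0)
    for _ in range(k):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def detection_statistic(background, target, pixel, k=2):
    """Background residual minus target residual; positive values suggest a target."""
    D = np.hstack([background, target])            # composite dictionary
    x = omp(D, pixel, k)
    n_b = background.shape[1]
    r_background = np.linalg.norm(pixel - background @ x[:n_b])
    r_target = np.linalg.norm(pixel - target @ x[n_b:])
    return r_background - r_target

# Synthetic 40-band signatures: 8 background atoms and 4 target atoms.
rng = np.random.default_rng(1)
background = rng.normal(size=(40, 8)); background /= np.linalg.norm(background, axis=0)
target = rng.normal(size=(40, 4)); target /= np.linalg.norm(target, axis=0)

target_pixel = 0.7 * target[:, 0] + 0.3 * target[:, 1] + 0.01 * rng.normal(size=40)
background_pixel = 0.6 * background[:, 2] + 0.4 * background[:, 5] + 0.01 * rng.normal(size=40)

score_target = detection_statistic(background, target, target_pixel)
score_background = detection_statistic(background, target, background_pixel)
```

Thresholding the statistic at zero separates the two synthetic pixels; a joint-sparsity variant would solve for a neighborhood of pixels at once with a shared support.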

3.4. Computational Imaging

Image formation discussed up to this point uses optics to form an image, uses a detector to convert photons to electrons,
and uses electronic processing to enhance information in the image. However, if one redesigns the front-end optics in an
unconventional way and, concurrently, redesigns the post-detection processing, it may be possible to generate the desired
information in a simpler manner, or it may be possible to generate information that would otherwise be costly to produce.
Linking optical design and post-detection processing in this way is referred to as computational imaging, a field in which ARL is
a leader.

Although  the  field  of  computational  imaging  grew  out  of  advances  in  electronic  detection  and  processing,  holography,  when 
viewed  in  retrospect,  is  one  of  the  first  computational  techniques  developed  to  improve  resolution.  As  originally  proposed,  the 
post-detection  processing  was  performed  optically. 

Developed  in  1948  and  made  popular  after  the  invention  of  the  laser  in  1960,  holography  interferes  a  reference  beam  coherently 
with  a  beam  reflected  from  an  object.  The  resulting  interference  pattern  is  the  hologram.  When  the  hologram  is  illuminated  by 
the  reference  beam,  the  hologram  produces  an  image  of  the  object.  With  the  advent  of  electronic  detection  and  processing,  the 
interference pattern is now recorded on an FPA and the image is reconstructed digitally. This is referred to as digital holography.

In  “Digital  holographic  imaging  of  aerosol  particles  in  flight,”  (page  135)  Berg  and  Videen  apply  digital  holography  to  characterize 
aerosol  particles.  Aerosol  particle  characterization  has  been  a  research  priority  in  monitoring  pollutants  for  many  decades.  In 
the  atmospheric-sciences  community,  such  characterization  has  overlapped  strongly  with  atmospheric  dynamics  and  chemistry. 
More  recently,  biological  aerosols  have  been  recognized  as  a  health  threat  within  the  medical  and  security  communities.  While 
counting  and  sizing  strategies  have  been  around  for  many  years,  rapidly  attaining  other  properties  of  aerosols  has  proven 
elusive.  In  addition  to  accuracy,  the  basic  requirements  are  speed,  low  cost,  and  automation.  Traditional  imaging  techniques 
are  hampered  by  the  depth-of-field  required  to  produce  a  sharp  image,  as  the  uncertainties  of  the  aerosol  position  within  a  flow 
generally  are  well  beyond  the  depth  of  focus.  Elastic  light  scattering  also  has  been  pursued,  but  retrieving  information  from  the 
scattered  field  has  proven  difficult. 

Berg  and  Videen  overcome  these  problems  by  interfering  the  scattered  light  from  the  aerosol  with  a  reference  beam  to  form 
a  digital  hologram.  By  applying  the  Fresnel-Kirchhoff  approximation  in  different  focal  planes,  Berg  and  Videen  overcame  the 
focusing  problem  and  reconstructed  images  from  different  depths.  Morphological  information  pertaining  to  the  aerosols  can 
be  retrieved  directly  from  the  reconstructed  images.  By  applying  holographic  techniques  to  image  aerosols  in  a  flow,  Berg  and 
Videen  provided  a  new  tool  to  the  aerosol  community. 
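
Numerical refocusing of a recorded hologram can be sketched as follows. The sketch substitutes the angular-spectrum propagation method for the Fresnel-Kirchhoff formulation named above, because it is compact to write with FFTs; the wavelength, pixel pitch, distances, and particle size are hypothetical values chosen only for illustration.

```python
import numpy as np

def propagate(field, wavelength, pitch, z):
    """Propagate a sampled complex field a distance z (angular-spectrum method)."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * fxx) ** 2 - (wavelength * fyy) ** 2
    kz = (2.0 * np.pi / wavelength) * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.where(arg > 0, np.exp(1j * kz * z), 0.0)   # discard evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Assumed setup: 0.5 um light, 5 um pixels, one opaque particle 2 mm from the sensor.
wavelength, pitch, z = 0.5e-6, 5e-6, 2e-3
n = 128
yy, xx = np.mgrid[:n, :n] - n // 2
transmission = np.where(xx**2 + yy**2 <= 5**2, 0.0, 1.0)    # disk of radius 5 pixels

# In-line geometry: the unscattered part of the beam acts as the reference.
sensor_field = propagate(transmission, wavelength, pitch, z)
hologram = np.abs(sensor_field) ** 2                         # what the FPA records

# Refocus the recorded intensity at several trial depths.
stack = {d: np.abs(propagate(hologram, wavelength, pitch, -d)) for d in (1e-3, 2e-3, 3e-3)}
```

Because the focal plane is chosen after the exposure, a particle anywhere in the flow can be brought into focus by scanning the trial depth, at the cost of the usual twin-image background of in-line holography.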

The  depth-of-field  problem— i.e.,  the  depth  of  a  region  over  which  an  imaging  system  is  considered  in  focus— is  inherent  to  all 
optical  systems.  As  discussed  in  Sec.  3.1,  millimeter-wave  technology  allows  one  to  scan  individuals  for  body-borne  explosives. 
In  controlled  situations,  such  as  an  airport,  authorities  can  scan  individuals  in  a  portal.  In  a  dynamic  urban  setting,  such  control 
may  not  be  possible,  yet  one  would  still  like  to  scan  individuals  as  they  pass  through  a  volume.  However,  the  1-m  aperture  of 
the  system  designed  by  Hedden  et  al.  has  an  extremely  narrow  depth-of-field.  In  “94-GHz  Imager  with  Extended  Depth  of  Field,” 
(page  145)  Mait  et  al.  used  computational  imaging  techniques  to  extend  the  region  over  which  a  millimeter  wave  image  remains 
in  focus.  Their  approach  is  to  aberrate  the  system  in  a  known,  controlled  fashion  (in  their  case,  by  using  an  optical  element  that 
has a cubic phase) and to perform simple post-detection processing. Extending the depth-of-field by a factor of five,
as Mait and Wikner did, is not possible using conventional means without incurring considerable cost. The computational approach requires
only one additional optical element and unique post-detection processing. However, it also requires
that one alter one’s notion of the function of optics in an imaging system given the availability and capability of post-detection
processing.
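
The effect of a cubic phase element can be illustrated with a one-dimensional pupil model. The sketch below compares how much the modulation transfer function (MTF) changes under defocus with and without a cubic phase. The phase strengths (30 rad of cubic phase, 10 rad of defocus at the pupil edge) are arbitrary illustrative values, not parameters of the 94-GHz system.

```python
import numpy as np

def mtf(alpha, psi, n=256, pad=1024):
    """MTF of a 1D pupil with cubic phase alpha*x^3 and defocus psi*x^2, x in [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n)
    pupil = np.exp(1j * (alpha * x**3 + psi * x**2))
    psf = np.abs(np.fft.fft(pupil, pad)) ** 2      # zero-padded to avoid wrap-around
    otf = np.fft.ifft(psf)                         # autocorrelation of the pupil
    return np.abs(otf) / np.abs(otf[0])

def defocus_change(alpha, psi):
    """Relative change of the MTF when defocus psi is introduced."""
    in_focus, defocused = mtf(alpha, 0.0), mtf(alpha, psi)
    return np.linalg.norm(in_focus - defocused) / np.linalg.norm(in_focus)

clear_change = defocus_change(0.0, 10.0)    # conventional aperture
cubic_change = defocus_change(30.0, 10.0)   # aperture with cubic phase element
```

Since the MTF of the cubic-phase system barely moves with defocus, a single fixed restoration filter can be applied at every depth, which is the "simple post-detection processing" half of the design.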

Whereas  the  cubic  phase  approach  is  a  fixed  solution  to  a  problem,  the  ability  to  sense  an  environment  and  adapt  the  optics 
based  on  those  measurements  to  improve  system  performance  is  another  aspect  of  computational  imaging.  In  “Experimental 
demonstration  of  coherent  beam  combining  over  a  7  km  propagation  path,”  (page  153)  Weyrauch  et  al.  apply  adaptive 
techniques  for  imaging  horizontally.  Unlike  remote  sensing,  where  cameras  peer  down  through  the  atmosphere,  tactical 
imaging  is  horizontal,  i.e.,  parallel  to  the  surface  of  the  earth.  Under  these  conditions,  atmospheric  turbulence  limits  system 
resolution  and  fidelity. 

Weyrauch  et  al.  demonstrated  that  they  can  control  phase  variations  in  the  optical  path  with  sufficient  precision  to  combine 
coherently  seven  laser  beams  emerging  from  an  adaptive  fiber-collimator  array  over  a  7  km  atmospheric  propagation  path.  This 
is a significant achievement, as the distance is not only more than an order of magnitude greater than in previous experiments
but also extends the range into that of practical working distances for energy transmission through the atmosphere. To perform
this  feat,  the  wavefront  phase  at  each  fiber-collimator  subaperture  was  controlled  by  its  internal  micro-fiber  positioner.  The 
output  beams  were  combined  coherently  and  focused  onto  their  target.  The  phase  locking  and  control  of  the  wavefront  were 
achieved  by  maximizing  the  target-return  optical  power  using  stochastic  parallel  gradient  descent  (SPGD)  techniques. 
This technique offers a lighter and more efficient system for resolving the complex atmospheric turbulence problem and improves
the resolution of the system. Direct sensing from the target and beam control through the SPGD mechanism eliminated an
external bulky and costly wavefront-detection system. The coherent beam-combining technique delivers kHz-rated phase-locking
compensation and the maximum power to the target. The technology directly supports the Army’s needs in developing
tactical  and  long-distance  sensing,  imaging,  communication,  and  directed  energy  systems. 
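
The SPGD control loop described above can be sketched in a few lines. This is a toy model under stated assumptions: seven unit-amplitude beams, a static random phase error per beam in place of real turbulence, and the normalized on-axis power as the measured metric. The gain, perturbation size, and iteration count are illustrative choices, not values from the experiment.

```python
import cmath
import math
import random

def combined_power(phases):
    """Normalized on-axis power of N unit-amplitude beams (1.0 means perfect phase locking)."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2 / len(phases) ** 2

def spgd(aberration, gain=20.0, delta=0.1, iterations=1000, seed=0):
    """Stochastic parallel gradient descent on the per-beam phase controls."""
    rng = random.Random(seed)
    control = [0.0] * len(aberration)
    for _ in range(iterations):
        # Perturb all channels at once with random signs.
        perturb = [delta * rng.choice((-1.0, 1.0)) for _ in control]
        j_plus = combined_power([a + c + p for a, c, p in zip(aberration, control, perturb)])
        j_minus = combined_power([a + c - p for a, c, p in zip(aberration, control, perturb)])
        # Move every channel along its own perturbation, scaled by the metric change.
        control = [c + gain * (j_plus - j_minus) * p for c, p in zip(control, perturb)]
    return control

rng = random.Random(1)
aberration = [rng.uniform(-math.pi, math.pi) for _ in range(7)]

start = combined_power(aberration)
control = spgd(aberration)
final = combined_power([a + c for a, c in zip(aberration, control)])
```

Only a single scalar measurement is needed per perturbation, which is why the approach can dispense with a separate wavefront sensor.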

High-resolution  imaging  of  distant  objects  has  many  military  applications.  While  lasers  are  used  for  long-distance  illumination 
through  the  atmosphere,  they  create  image  speckle  due  to  coherent  phase  aberrations.  Atmospheric  turbulence  can  cause 
these  speckle  spots  to  wander  on  the  target,  which  can  limit  resolution. 

In “Turbulence-free ghost imaging,” (page 157) Meyers, Deacon, and Shih present a fundamentally new approach to meet
this challenge. Turbulence-free ghost imaging is a computational imaging technique that can reconstruct an object that is not
in a conventional sense imaged by the system. Through imaging experiments performed at ARL, Meyers et al. suggest that ghost
imaging can be performed free of the adverse effects of turbulence.

Meyers,  Deacon,  and  Shih  used  two-photon  interference  and  the  superposition  of  the  quantum  properties  of  light  to  image 
through turbulence. While a single-pixel light sensor sensed the total light reflected from an object, a second sensor, a camera,
imaged  the  photons  coming  just  from  the  laser  light  source.  The  coincident  measurements  were  combined  computationally, 
creating  the  ghost  image  of  the  target. 
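
The computational step, correlating a single-pixel "bucket" signal with camera images of the source, can be sketched with a classical simulation. The sketch below is only an analogy for the bookkeeping: it uses random intensity patterns and ordinary intensity correlation, and it does not model two-photon interference, turbulence, or photon counting. The object and shot count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reflective object on an 8 x 8 grid.
obj = np.zeros((8, 8))
obj[2:6, 3:5] = 1.0

n_shots = 5000
patterns = rng.random((n_shots, 8, 8))          # what the reference camera sees per shot
bucket = (patterns * obj).sum(axis=(1, 2))      # single-pixel sensor: total reflected light

# Correlate bucket fluctuations with each reference pixel.
ghost = (bucket[:, None, None] * patterns).mean(axis=0) - bucket.mean() * patterns.mean(axis=0)
recovered = ghost > 0.5 * ghost.max()
```

Neither sensor alone forms an image of the object: the bucket sensor has no spatial resolution and the camera never views the object, yet their correlation recovers its shape.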

Ghost  imaging  shares  some  attributes  with  conventional  holography.  In  conventional  holography,  signal  and  reference  beams 
interfere  as  coherent  waves  to  form  a  pattern  that  generates  an  image.  In  ghost  imaging,  signal  and  reference  beams  combine 
to  form  an  image  based  on  the  correlation  between  their  quantum  properties.  To  function  properly,  though,  ghost  imaging 
requires  fast,  single-photon-sensitive  cameras. 

The  work  by  Meyers,  Deacon,  and  Shih  provides  a  unique,  fundamental  contribution  to  advanced  imaging  techniques.  Ghost 
imaging  is  different  from  conventional  imaging  in  that  the  illuminating  light  is  imaged,  as  opposed  to  the  object.  Quantum 
two-photon  interference  provides  improved  resolution  to  the  images  as  the  aberrations  caused  by  atmospheric  turbulence  are 
cancelled  out  in  the  quantum  process. 

4.  The  Future  of  Imaging 

As the previous discussion indicates, advances in detectors and electronic processing have expanded the capabilities of optical
sensing beyond replicating the appearance of an object, and the computational imaging examples are only the beginning.
Faster  detectors,  such  as  the  ones  used  for  ghost  imaging,  and  smaller  cameras  will  allow  optical  designers  to  exploit  both  time 
and  space  to  create  even  more  capabilities.  For  example,  wafer-scale  cameras  enable  the  development  of  small,  multi-aperture 
cameras,  which  has  allowed  designers  to  create  very  thin  cameras,3  cameras  with  high  dynamic  range,4  and  cameras  sensitive 
to  polarization.4 

Extremely fast detectors enable fast cameras and have enabled femtography.5 In a single femtosecond (10⁻¹⁵ seconds),
light propagates only about 0.3 µm. Using a detector array capable of femtosecond exposure times, it is possible to track the
propagation  of  optical  beams  and,  consequently,  to  label  rays.  Using  these  labels,  it  is  possible  to  keep  track  of  rays  as  they  go 
out  of  sight  and  return  after  being  reflected  by  an  object  out  of  the  camera’s  line  of  sight.  In  this  way  it  is  possible  to  perform 
non-line-of-sight  imaging,  to  see  around  corners,  without  relying  upon  the  quantum  nature  of  light.  Such  capabilities  will  no 
doubt  be  of  extreme  value  to  the  military.  The  future  of  imaging  remains  very  bright. 

We examine the space-bandwidth product of wide field-of-view imaging systems as the systems scale in
size. Our analysis is based on one conducted to examine the behavior of a plano-convex lens imaging onto
a flat focal geometry. We extend this to consider systems with monocentric lenses and curved focal geometries.
As a means to understand system cost, and not just performance, we also assess the volume and
mass  associated  with  these  systems.  Our  analysis  indicates  monocentric  lenses  imaging  onto  a  curved 
detector  outperform  other  systems  for  the  same  design  constraints  but  do  so  at  a  cost  in  lens  weight. 

OCIS  codes:  110.0110,  220.4830,  220.3620. 


1.  Introduction 

The proliferation of imaging assets for security and defense
has generated a demand for high resolution imaging
across an ever increasing field-of-view (FOV).
The off-the-shelf engineering solution to this problem
is to tile the desired FOV with numerous high resolution
cameras. This approach, though simple to implement,
uses limited resources, such as volume and weight,
inefficiently. A more inspired approach is to redesign
the optics. However, designers very quickly
realize how difficult it is to maintain high resolution
as the FOV increases. Using simple arguments reflected
in Fig. 1, Lohmann showed that performance
is limited primarily by optical aberrations [1].

Lohmann and his colleagues used space-bandwidth
as a metric for imaging performance in a second, more
analytic, study and examined the limits on space-bandwidth
product (SBWP) as a function of FOV
[2]. Their analysis of a simple plano-convex lens imaging
onto a flat detector provided a rough indication of
the physical scales when geometric aberrations, as opposed
to diffraction, dominate imaging performance.

To provide high resolution over a wide FOV, researchers
have proposed alternate lenses and alternate
detector geometries to overcome aberrations
[3,4,5,6,7,8,9,10]. For example, a curved detector removes
geometric image distortions introduced at the
edge of the FOV when imaging onto a flat detector [4],
and fabrication of such detectors is an ongoing research
topic [5,6,7,8]. Further, by its nature, a monocentric
system, i.e., one in which the front and back
surfaces of the lens have a common center, reduces
the aberrations imposed on large off-axis rays.
Monocentric triplets are typically used to correct
chromatic aberrations in eyepieces [11]. Although interest
in monocentric systems has recently increased
[9,10], the advantages of such systems have a long
history, as evidenced by the Sutton panoramic
water lens patented in 1859 and the Baker ball lens
from 1942 [3]. The aberrations can be reduced even
further if the refractive index of the lens is
graded [12].

In particular, [4] presents a quantitative analysis
of the performance of three different imaging systems
at a single scale: a plano-convex lens imaging
onto a flat focal geometry, a Cooke triplet imaging
onto a flat focal geometry, and a ball lens imaging
onto a curved focal geometry. The linear scale of each
system is approximately 10 mm, and, in addition to
comparing on- and off-axis point spread functions,
the analysis examines chromatic behavior for three
visible wavelengths. The results indicate that a ball
lens imaging onto a curved focal plane provides the
best overall performance.

In our work, we explore how changes in optical design affect imaging
performance as a function of system size. To do so, we base our work on that
of Lohmann and extend it to account for changes in lenses and in detector
geometries. Using our analysis, we examine the impact on system size and
weight as performance demands increase and we highlight general trends. We
do so, however, for only a single wavelength; we do not consider chromatic
behavior. Further, our approach can be used by others as a framework for
generating quantitative data when necessary.

In  Section  2  we  present  the  metrics  Lohmann  used 
to  characterize  optical  performance  and,  in  Section  3, 
we  describe  our  method  for  imaging  analysis.  We 
present  our  data  and  provide  a  discussion  of  it  in 
Section  4  and  conclude  in  Section  5  with  additional 
discussion  and  summary  remarks. 


Fig.  1.  Space-bandwidth  of  an  optical  system  as  a  function  of 
scale.  Reproduced  from  [1]. 


2.  Space-Bandwidth  Analysis 

In  his  analysis,  Lohmann  used  the  space-bandwidth 
product  S  as  a  measure  of  image  quality.  The  space- 
bandwidth  S  is  the  number  of  resolvable  points  in  an 
image  plane, 

$$S = \frac{A}{a_{\mathrm{res}}}, \qquad (1)$$

where $A$ is the image plane area and $a_{\mathrm{res}}$ is the area of a
single resolvable spot. One can approximate the resolution spot size
$a_{\mathrm{res}}$ as the sum of contributions from diffraction and
aberration [13],

$$a_{\mathrm{res}} = (\delta x)^2 + \xi^2 = (\lambda f/D)^2 + \xi^2, \qquad (2)$$

where $\lambda$ is the wavelength of illumination, $f$ and $D$ are the lens
focal length and diameter, respectively, and $\xi$ is a measure of lateral
geometric aberrations. The term $\xi^2$ is its variance. To increase
space-bandwidth, one can increase the size of the image plane, i.e.,
increase the FOV, reduce the size of the resolvable spot, or both. However,
resolution area and FOV are linked to the imaging system and cannot be
controlled independently of one another.
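As a rough numerical illustration of Eqs. (1) and (2), the following Python sketch evaluates the space-bandwidth with and without an aberration term. The function names and all numerical values (a 10 mm, f/2 lens, a 5 mm square image plane, and a 2 μm aberration blur) are hypothetical and chosen only for illustration; they are not taken from the analysis.

```python
import math


def resolvable_spot_area(wavelength, focal_length, diameter, xi):
    """Eq. (2): diffraction term plus aberration term (lengths in meters)."""
    diffraction = (wavelength * focal_length / diameter) ** 2
    return diffraction + xi ** 2


def space_bandwidth(image_area, spot_area):
    """Eq. (1): number of resolvable points in the image plane."""
    return image_area / spot_area


# Hypothetical example values, not taken from the paper.
wavelength = 500e-9        # 500 nm
diameter = 10e-3           # 10 mm aperture
focal_length = 20e-3       # gives an f/2 lens
image_area = (5e-3) ** 2   # 5 mm x 5 mm image plane

s_ideal = space_bandwidth(
    image_area, resolvable_spot_area(wavelength, focal_length, diameter, 0.0))
s_aberrated = space_bandwidth(
    image_area, resolvable_spot_area(wavelength, focal_length, diameter, 2e-6))

print(f"diffraction only: S = {s_ideal:.3g}")
print(f"with 2 um blur:   S = {s_aberrated:.3g}")
```

With these assumed values, the aberration term is four times the diffraction term, so the space-bandwidth drops by a factor of five.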

In [1], Lohmann assumed a constant FOV and considered only the impact on
space-bandwidth as a function of the resolution spot size. In this case, if
one considers only diffraction, an increase in lens diameter reduces the
size of a resolvable spot, which, in turn, increases the space-bandwidth.
However, because increasing the diameter of a lens also increases the impact
of aberrations, the relationship between the size of a lens and its
space-bandwidth is more complex than that specified simply by diffraction.

Lohmann used Eqs. (1) and (2) to generate the heuristic curves represented
in Fig. 1, which indicate the relationship between the scale of an imaging
system and its space-bandwidth. Figure 2 indicates how the imaging system is
scaled by a factor $M$, while the f-number,

$$f_\# = f/D, \qquad (3)$$

is held constant.

In the absence of aberrations, $S$ will increase without bound as the
imaging system increases in size. In contrast, because aberrations scale
with the size of the imaging system, in the absence of diffraction, there
would be no improvement in $S$ as the system increases in size. Thus, the
actual performance of an imaging system is a combination of these behaviors:
increasing with scale in regions where diffraction dominates and constant in
regions where aberrations dominate.
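The two regimes described above can be reproduced with a short Python sketch built on Eqs. (1) and (2). It assumes, as in the text, that the image area grows as the square of the scale factor, that the diffraction spot is fixed at constant f-number, and that the aberration blur grows linearly with scale. The function name and the base-system values are hypothetical.

```python
def scaled_space_bandwidth(scale, base_area, wavelength, f_number, base_xi):
    """Space-bandwidth of a system scaled by `scale` at constant f-number.

    Image area grows as scale**2, the diffraction spot (wavelength * f_number)
    is unchanged, and the aberration blur grows linearly with scale.
    """
    area = base_area * scale ** 2
    spot_area = (wavelength * f_number) ** 2 + (scale * base_xi) ** 2
    return area / spot_area


# Hypothetical base system (scale = 1); values are illustrative only.
BASE_AREA = 1e-6     # 1 mm^2 image plane, in m^2
WAVELENGTH = 500e-9  # m
F_NUMBER = 2.0
BASE_XI = 1e-6       # 1 um aberration blur at scale = 1

# Limit reached when aberrations dominate: S no longer depends on scale.
SATURATION = BASE_AREA / BASE_XI ** 2

for scale in (1e-3, 1e-2, 1e-1, 1.0, 1e1, 1e2, 1e3):
    s = scaled_space_bandwidth(scale, BASE_AREA, WAVELENGTH, F_NUMBER, BASE_XI)
    print(f"M = {scale:8.3g}  S = {s:10.4g}  S/S_sat = {s / SATURATION:.4f}")
```

For small scale factors the output grows quadratically with scale; for large scale factors it approaches the saturation value.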

The last curve generated by Lohmann indicates how designers compensate for
the limitations imposed by aberrations. To increase $S$ as system scale
increases, the f-number of the system must necessarily also increase. In a
second, coauthored publication, Lohmann explored this link between f-number
and space-bandwidth, and considered also the FOV of the system [2]. It is
this publication that provides the starting point for our analysis.

With reference to Fig. 3, Lohmann and his colleagues considered the
space-bandwidth properties of the simple plano-convex lens represented in
the upper left. In comparison to the analysis in [1], which considered fixed
f-number lenses and a single scaling factor $M$, the analysis in [2]
considered variable f-number lenses. In addition, it considered the FOV of
the imaging system. As represented in the figure, we extend Lohmann’s
analysis to include monocentric lenses in addition to plano-convex lenses,
as well as two different focal plane geometries, flat and curved. We also
consider the special case of a Luneburg lens with a curved detector on its
surface. A Luneburg lens is monocentric with a variable refractive index,
$n(r) = [2 - (r/R)^2]^{1/2}$, that images collimated light perfectly on its
surface [12].

Fig. 2. Imaging system scaling considered by Lohmann. (a) Base system.
(b) Scaled system.

Fig. 3. Representative imaging systems considered for analysis.
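The Luneburg index profile quoted above is easy to evaluate directly. The following Python sketch is a minimal illustration; the function name `luneburg_index` is hypothetical, and $R$ is taken to be the outer radius of the lens.

```python
import math


def luneburg_index(r, lens_radius):
    """Refractive index n(r) = sqrt(2 - (r/R)^2) of a Luneburg lens.

    `r` is the distance from the lens center and `lens_radius` is R.
    """
    if not 0.0 <= r <= lens_radius:
        raise ValueError("r must lie between 0 and the lens radius")
    return math.sqrt(2.0 - (r / lens_radius) ** 2)


# Index sampled from the center to the surface of a unit-radius lens.
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"r/R = {fraction:.2f}  n = {luneburg_index(fraction, 1.0):.4f}")
```

The index falls smoothly from $\sqrt{2}$ at the center to 1 at the surface, so the lens is index-matched to air at its boundary.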

A.  Space-Bandwidth  as  a  Function  of  Focal  Plane 
Geometry 

It is important to consider the geometry of the focal plane since it affects
the numerator of Eq. (1) and the size of the detector sets the FOV. With
reference to Fig. 4, the image area $A$ as a function of the half-angle
$\beta$ for a flat focal plane is

$$A_{\mathrm{flat}} = (2 f_\# D \tan\beta)^2, \qquad (4)$$

and for a curved focal plane,

$$A_{\mathrm{curv}} = 2\pi (f_\# D)^2 (1 - \cos\beta). \qquad (5)$$

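In the absence of aberrations, the size of a resolvable spot for a flat
detector is

$$a_{\mathrm{res,flat}} = \left(\frac{\lambda f}{D\cos^3\beta}\right)^2, \qquad (6)$$

and for a curved detector

$$a_{\mathrm{res,curv}} = \left(\frac{\lambda f}{D\cos\beta}\right)^2. \qquad (7)$$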


The cosine-cube scaling of the conventional diffraction-limited spot on a
flat detector results from the oblique incidence of the illumination on the
aperture, the increased distance rays have to travel (relative to on-axis),
and the oblique incidence of the illumination on the detector. With a curved
detector, the latter two factors are removed and only the cosine scaling due
to the oblique incidence of the illumination on the aperture remains. As a
point of reference, if this cosine scaling is absent, the resolvable spot
follows from the conventional definition,

$$a_{\mathrm{res}} = \left(\frac{\lambda f}{D}\right)^2. \qquad (8)$$

This assumes a uniform beam illuminates the lens aperture. Also, in essence,
we approximate the diameter of the spot by its full-width at half-maximum.

Thus, Eq. (1), in combination with Eqs. (4)-(7), yields closed-form
expressions for the space-bandwidth given the two detector geometries,

$$S_{\mathrm{flat}} = 4 (D/\lambda)^2 (1 - \cos^2\beta)\cos^4\beta, \qquad (9)$$

$$S_{\mathrm{curv}} = 2\pi (D/\lambda)^2 (1 - \cos\beta)\cos^2\beta. \qquad (10)$$

These functions are represented in Fig. 5. The existence of a maximum is due
to the fact that, for small angles, the rate at which the spot size grows is
slower than the rate at which the detector area increases. For large angles,
this rate behavior reverses. The angle at which this behavior switches is
the one that yields maximum space-bandwidth. For a flat detector, this angle
is $\beta$ = 35.2° and for a curved detector, $\beta$ = 48.2°. Not only is
the angle that maximizes space-bandwidth larger for a curved detector than
for a flat one, the maximum value of the space-bandwidth is also about 1.5
times larger. This provides some indication of the advantages of a curved
detector over a flat one.
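The angles and the ratio quoted for Eqs. (9) and (10) can be checked numerically. The Python sketch below normalizes both expressions by $(D/\lambda)^2$ and locates each maximum by a brute-force search over half-field angles from 0° to 90°; the helper names and the grid search are illustrative choices, not the method used in the analysis.

```python
import math


def s_flat(beta):
    """Eq. (9) normalized by (D/lambda)^2; beta in radians."""
    return 4.0 * (1.0 - math.cos(beta) ** 2) * math.cos(beta) ** 4


def s_curved(beta):
    """Eq. (10) normalized by (D/lambda)^2; beta in radians."""
    return 2.0 * math.pi * (1.0 - math.cos(beta)) * math.cos(beta) ** 2


def argmax_on_grid(func, steps=90000):
    """Return (angle in degrees, peak value) from a 0.001-degree grid search."""
    grid = (math.radians(90.0 * i / steps) for i in range(steps + 1))
    best_beta = max(grid, key=func)
    return math.degrees(best_beta), func(best_beta)


beta_flat, peak_flat = argmax_on_grid(s_flat)
beta_curved, peak_curved = argmax_on_grid(s_curved)

print(f"flat detector:   beta_max = {beta_flat:.2f} deg")
print(f"curved detector: beta_max = {beta_curved:.2f} deg")
print(f"peak ratio curved/flat = {peak_curved / peak_flat:.3f}")
```

Setting the derivative of each expression to zero gives the same result in closed form: $\cos^2\beta = 2/3$ for the flat detector and $\cos\beta = 2/3$ for the curved detector, with a peak ratio of $\pi/2 \approx 1.57$.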

The ideal behavior noted in Fig. 5 is derived assuming no aberrations and no
geometrical scaling. Thus, as the area of a curved detector increases, the
number of resolvable spots increases without limits. The normalized curve
shown follows $1 - \cos\beta$.

Fig. 4. Geometry for rays assuming a (a) flat and (b) curved focal plane.

Fig. 5. Space-bandwidth product $S$ as a function of half-field angle
$\beta$ for flat and curved detectors.

However, this analysis is valid only in the absence of aberrations. How far
the space-bandwidth deviates from the values in Eqs. (9) and (10) is a
measure of the severity of aberrations. To assess this, we need to analyze
the optical performance of the lenses in combination with these detectors.
In the next section, we describe the procedure we used to do so.

3.  Lens  Analysis 

In  this  section  we  present  the  results  of  our  optical 
simulations  to  assess  the  impact  of  both  diffraction 
and  geometrical  aberrations  on  space-bandwidth. 
The  lenses  we  analyzed  (plano-convex,  monocentric, 
and  Luneburg)  are  represented  in  Fig.  6. 

Our analysis followed as closely as possible the approach in [2]. We
analyzed performance at a single wavelength, $\lambda$ = 500 nm, i.e., we
did not consider chromatic behavior, and, except for the Luneburg lens,
assumed $n$ = 1.5. (We selected real glass materials that met this
criterion.) We varied the lens diameter, $D$, from 50 μm to 5 m and the
f-number, $f_\#$, from 1 to 1000.

For each lens and detector geometry, we varied the angle of the incident
beam over the FOV and determined its corresponding spot size on the
detector. To do so, we launched a large number of rays from a particular
field point, including the chief ray, into the entrance pupil of the lens,
and traced them through the optics to the image surface. We calculated the
spot size as the root-mean-square of the differential distance between the
location of the chief ray and the locations of the other rays. We also
calculated the diffraction spot size from the Airy disk.
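The spot-size calculation described above can be sketched in a few lines of Python. This is a stand-alone illustration, not the ZEMAX-based procedure used in the analysis: the function names are hypothetical, the ray intercepts are supplied as plain (x, y) pairs, and the Airy disk is represented by the textbook first-dark-ring diameter, $2.44\,\lambda f_\#$, which is an assumption about the convention rather than something stated in the text.

```python
import math


def rms_spot_radius(chief_xy, ray_xy):
    """RMS distance between the chief-ray intercept and the other ray intercepts."""
    cx, cy = chief_xy
    squared = [(x - cx) ** 2 + (y - cy) ** 2 for x, y in ray_xy]
    return math.sqrt(sum(squared) / len(squared))


def airy_disk_diameter(wavelength, f_number):
    """Diameter of the Airy disk to its first dark ring: 2.44 * lambda * f#."""
    return 2.44 * wavelength * f_number


# Hypothetical ray intercepts on the image surface, in meters.
chief = (0.0, 0.0)
rays = [(1e-6, 0.0), (-1e-6, 0.0), (0.0, 1e-6), (0.0, -1e-6)]

print(f"RMS spot radius:    {rms_spot_radius(chief, rays):.3e} m")
print(f"Airy disk diameter: {airy_disk_diameter(500e-9, 2.0):.3e} m")
```

Comparing the two numbers indicates whether a given field point is limited by geometric aberrations or by diffraction.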

We inserted these values into Eqs. (1) and (2) to determine $S$ and, in
accordance with [2], assumed the value of resolution spot size was valid
across the entire detector plane. We designated the maximum value of
space-bandwidth $S_{\max}$ and the angle at which this maximum was achieved,
$\beta_{\max}$.
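The search for the maximum space-bandwidth and its angle can be sketched as a sweep over field angles. The Python example below handles the curved-detector case only: it uses the spherical-cap area of Eq. (5), a diffraction spot that scales as $1/\cos\beta$, and the sum in Eq. (2). The function name and the aberration data are hypothetical; in the analysis the spot sizes came from ray tracing.

```python
import math


def sweep_curved_detector(angles_deg, rms_spots, wavelength, f_number, diameter):
    """Return (beta_max in degrees, S_max) over the sampled half-field angles.

    Each spot size is assumed to hold across the whole detector up to that angle.
    """
    best_angle, best_s = 0.0, 0.0
    for beta_deg, xi in zip(angles_deg, rms_spots):
        beta = math.radians(beta_deg)
        # Eq. (5): area of a spherical cap of radius f = f_number * diameter.
        area = 2.0 * math.pi * (f_number * diameter) ** 2 * (1.0 - math.cos(beta))
        # Diffraction spot with cosine scaling for a curved detector.
        diffraction = wavelength * f_number / math.cos(beta)
        spot_area = diffraction ** 2 + xi ** 2   # Eq. (2)
        s = area / spot_area                     # Eq. (1)
        if s > best_s:
            best_angle, best_s = beta_deg, s
    return best_angle, best_s


WAVELENGTH, F_NUMBER, DIAMETER = 500e-9, 2.0, 5e-3  # illustrative values
angles = list(range(0, 81))                         # 0 to 80 degrees
no_aberration = [0.0] * len(angles)
# Hypothetical blur growing quadratically with angle (not measured data).
with_aberration = [2e-6 * (a / 80.0) ** 2 for a in angles]

ideal = sweep_curved_detector(angles, no_aberration, WAVELENGTH, F_NUMBER, DIAMETER)
aberrated = sweep_curved_detector(angles, with_aberration, WAVELENGTH, F_NUMBER, DIAMETER)
print("ideal:    ", ideal)
print("aberrated:", aberrated)
```

With no aberrations the sweep recovers the diffraction-defined angle of roughly 48° on the 1° grid; adding the assumed blur lowers the maximum space-bandwidth.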

The details pertaining to each lens and detector geometry are described in
Appendix A. It is important to note that, again in accordance with the
approach used in [2], we did not consider all potential means to improve
lens performance. For example, we did not use any aspherical surfaces.


Fig. 7. $\beta_{\max}$ as a function of $f_\#$ for variable lens diameters
for a plano-convex lens imaging onto a flat detector. Reproduced from [2].


One example of the results presented in [2] is reproduced in Fig. 7. Our
results are presented in Figs. 8-12. We feel the consistency between Figs. 7
and 8(a) validates our approach and gives us confidence in drawing
conclusions from the data for different systems, which we present in
Section 4.

In addition to Fig. 7, we compared other data from [2] to ours. Our data
were consistent with [2] in all cases, e.g., spot size as a function of
angle. However, we felt that including all comparisons would have
overwhelmed the reader and added little to the discussion. We present the
space-bandwidth data for its relevance to this discussion. To aid the
comparison between our work and [2], Fig. 8 contains data for an additional
lens with $D$ = 5 μm that the other lens systems do not. ZEMAX was unable to
converge consistently to a design for all of the systems with such a small
diameter.

We  note  the  special  case  of  the  Luneburg  lens  in 
Fig.  12.  The  Luneburg  lens,  in  effect,  provides  ideal 
imaging  because  the  area  of  the  smallest  resolvable 
spot  is  independent  of  the  angle  of  incidence.  The 
space-bandwidth  is  maximum  at  80°  because  that 
was  the  upper  limit  of  angles  we  considered.  (Beyond 
80°  ZEMAX  was  unable  to  generate  consistently 
stable data). Based on our data, the space-bandwidth of the Luneburg lens as
a function of angle $\beta$ is

$$S_{\mathrm{Luneburg}} = 1.34 (D/\lambda)^2 (1 - \cos\beta). \qquad (11)$$

Fig. 6. (Color online) Lenses analyzed. (a) Plano-convex, (b) Monocentric,
(c) Luneburg.

Fig. 8. (Color online) Analysis of a plano-convex lens imaging onto a flat
detector. (a) $\beta_{\max}$ and (b) $S_{\max}$ as a function of $f_\#$ for
variable lens diameters.

The compact presentation of data in Figs. 8-12 belies the effort required to
generate it. Data collection for a single combination of lens type and
detector requires the analysis of elements with six different diameters and,
for each diameter, 16 f-numbers, for a total of 96 lenses. Since we analyzed
31 field points for plano-convex lenses and 41 field points for monocentric
lenses, we therefore determined 2976 spot size data values for a single
detector geometry with a plano-convex lens and 3731 values for a single
detector geometry with a monocentric lens. Not including the analysis of the
Luneburg lens, this means we generated and analyzed over 13,000 optical
spots to generate the graphs presented in this section.

Needless to say, our data collection was not performed manually. Instead, we
implemented an automated method of data collection using a ZEMAX extension
file written in C++ to drive ZEMAX externally. Because of the unstable
nature of the optimization algorithm in the programming environment required
for this task, development of the automation method took a considerable
amount of time to test and validate.

Fig. 9. (Color online) Analysis of a plano-convex lens imaging onto a curved
detector. (a) $\beta_{\max}$ and (b) $S_{\max}$ as a function of $f_\#$ for
variable lens diameters.

4.  Data  Interpretation 

In this section, we attempt to draw conclusions from our results. We need to
be cautious when making comparisons between systems because each was
optimized using slightly different criteria. Nonetheless, we feel the trends
displayed are consistent and allow us to provide some explanation for the
behavior exhibited. For example, the improvement in maximum space-bandwidth
in Fig. 9(b) over Fig. 8(b) is due most likely to the close match between
the curved detector and the Petzval surface shape.

Figures 8 and 10 reaffirm the conclusions from [2]. Only for physically
small lenses and lenses with large f-numbers are aberrations sufficiently
small that diffraction dominates their performance. Small f-number lenses
and large diameter lenses are most affected by aberrations.


Fig. 10. (Color online) Analysis of a monocentric lens imaging onto a flat
detector. (a) $\beta_{\max}$ and (b) $S_{\max}$ as a function of $f_\#$ for
variable lens diameters.


We note also in these figures that the FOV defined by diffraction is an
upper bound. This indicates that, for small f-number lenses, the size of the
resolution spot from aberrations is larger than the diffraction spot size.
But, as f-number increases, the impact of aberrations lessens and behavior
is dominated by diffraction. This behavior is true for both plano-convex and
monocentric lenses used in conjunction with a flat detector.

As is evident in Figs. 9 and 11, this changes with a curved detector. In
fact, the behavior of plano-convex and monocentric lenses differs
considerably. For a plano-convex lens, the angle that achieves the maximum
space-bandwidth starts at 38° for an f/1 lens, increases rapidly to a peak
angle, and decreases slowly to the angle defined by diffraction. Lenses with
small diameters approach the diffraction-defined angle more rapidly than
large lenses. Large f-number lenses are again dominated primarily by
diffraction.

The shape of the curves in Fig. 9(a) reflects the interplay between
aberrations and diffraction for off-axis angles. For large off-axis angles,
the aperture is effectively stopped down. This increases the diffractive
spot to a size that is, apparently, comparable to the one generated by
off-axis geometric aberrations. We note that, although a curved detector
geometry increases the value of $\beta_{\max}$ for small $f_\#$ lenses, the
increase in $S_{\max}$ is small. This is due to the definition of
space-bandwidth, which assumes a constant spot size across the entire image.
The spot sizes produced for large off-axis angles are larger than those for
on-axis spots. An alternative definition of space-bandwidth, one that allows
for variation in spot size across the image, might show a more dramatic
increase in space-bandwidth with angle.

For a monocentric lens with a curved detector, all rays provide effectively
on-axis performance. However, geometric aberrations are reduced due to the
stopped-down aperture at large off-axis angles. Figure 11 therefore reflects
this reduction in off-axis geometric aberrations.

The behavior exhibited in Fig. 12 is easily explained by Eq. (11). Since the
value of $\beta_{\max}$ is the same in each case, the space-bandwidth is
constant for a fixed diameter.

Fig. 11. (Color online) Analysis of a monocentric lens imaging onto a curved
detector. (a) $\beta_{\max}$ and (b) $S_{\max}$ as a function of $f_\#$ for
variable lens diameters.

[Fig. 12 legend: lens diameter (mm): 0.050, 0.500, 5.000, 50.000, 500.000,
5000.000; horizontal axes: $f_\#$ from 1 to 1000.]

Fig. 12. (Color online) Analysis of a Luneburg lens imaging onto a curved
detector. (a) $\beta_{\max}$ and (b) $S_{\max}$ as a function of $f_\#$ for
variable lens diameters. Since the value of $\beta_{\max}$ is independent of
$f_\#$ and $D$, all curves in (a) lie on top of one another.
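To make the dependence in Eq. (11) concrete, the Python sketch below evaluates it at the 80° upper limit of the angles considered, for the six lens diameters in the study and the 500 nm wavelength stated in Section 3. The function name is hypothetical; the coefficient 1.34 is the fitted value reported in Eq. (11).

```python
import math


def luneburg_space_bandwidth(diameter, wavelength, beta_deg):
    """Eq. (11): S = 1.34 * (D / lambda)^2 * (1 - cos(beta))."""
    beta = math.radians(beta_deg)
    return 1.34 * (diameter / wavelength) ** 2 * (1.0 - math.cos(beta))


WAVELENGTH = 500e-9                              # m
DIAMETERS_MM = (0.05, 0.5, 5.0, 50.0, 500.0, 5000.0)
BETA_MAX_DEG = 80.0                              # upper limit of angles considered

for d_mm in DIAMETERS_MM:
    s = luneburg_space_bandwidth(d_mm * 1e-3, WAVELENGTH, BETA_MAX_DEG)
    print(f"D = {d_mm:8.2f} mm  S_max = {s:.3e}")
```

The f-number does not appear in the expression, so each diameter yields a single value; a tenfold increase in diameter raises the space-bandwidth a hundredfold.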

Figure 13 compares the performance of each system we analyzed for two
different scales, $D$ = 50 mm and $D$ = 500 mm. These scales represent the
ones a designer might consider for imaging with 100 megapixel to 10
gigapixel detectors. As noted in the beginning of this section, we exercise
caution in making these comparisons and concentrate primarily on trends and
relative magnitudes. For f-numbers greater than 10, detector geometry
dominates and a curved detector provides about 2 orders of magnitude
improvement in space-bandwidth over a flat one. Given that large f-number
lenses have long focal lengths, most likely detector shape dominates because
the distinction between lens characteristics is marginal.

For f-numbers less than 10, lens properties dominate. For a flat detector, a
monocentric lens provides approximately 2 orders of magnitude improvement in
space-bandwidth over a plano-convex lens and approximately 4 orders of
magnitude improvement for a curved detector.

For both scales, the performance of a plano-convex lens with a curved
detector is comparable to a monocentric lens with a flat detector. That is,
equivalent performance can be achieved using either a poor quality lens with
a curved detector or a high quality lens with a flat detector. However,
replacing a flat detector in a monocentric system with a curved detector
improves performance for low f-number lenses by several orders of magnitude.
The curved detector, no doubt, takes full advantage of the increased FOV
provided by a monocentric lens. The Luneburg lens, which essentially
represents an upper bound on performance, provides additional improvement
over extremely fast ($\approx f/1$) monocentric lenses.

To underscore the link between our work and [1], we reformat our data in
Fig. 14. For each imaging system, we present space-bandwidth as a function
of lens diameter $D$ for variable f-number lenses. Results for the Luneburg
lens are plotted as a dashed line in each figure. Note that, with
space-bandwidth and diameter both plotted on logarithmic scales, a quadratic
function is a straight line with slope 2 (i.e., an order of magnitude change
in linear scale yields 2 orders of magnitude change in space-bandwidth).

Fig. 13. (Color online) $S_{\max}$ as a function of $f_\#$ for various
imaging systems with (a) $D$ = 50 mm and (b) $D$ = 500 mm. Labels for the
graphs indicate lens type (pcx: plano-convex, mc: monocentric, lu: Luneburg)
and detector geometry (flat versus curved).

Fig. 14. $S_{\max}$ as a function of $D$ for various imaging systems.
(a) Plano-convex lens and flat detector, (b) Plano-convex lens and curved
detector, (c) Monocentric lens and flat detector, (d) Monocentric lens and
curved detector.

As predicted by Lohmann in Fig. 1, for a fixed f-number lens, as the size of
the lens increases, the space-bandwidth saturates and the only way a
designer can increase space-bandwidth is to change the imaging system. Given
the discussion in [1], this is possible only by increasing the f-number of
the lens. This is in contrast to our approach, which considers alternate
lenses and detectors.

Figure 14 highlights again the advantages of a monocentric lens and a curved
detector. Most systems exhibit quadratic behavior. However, the
space-bandwidth of f/1 and f/2 lenses is already saturated when
$D$ = 100 mm.

In addition to supporting Lohmann’s explanation for lens behavior, our
analysis can also be used as an aid to design. Figure 15(a) indicates the
system volume $V_s$ required to achieve a desired space-bandwidth for each
scale and lens system we analyzed, where

$$V_s = V_l + V_a, \qquad (12)$$

$V_l$ is the lens volume, and $V_a$ is the volume required to image onto the
detector. Although we used all lens systems to generate Fig. 15(a), not all
data points are shown. Instead, the values shown for each system are those
that produce maximum space-bandwidth for a minimum amount of volume.

The formulae we used for $V_l$ and $V_a$ are listed in Tables 1 and 2. For
plano-convex lenses, we used the volume determined by ZEMAX. For all
systems, $h$ is the axial image distance from the lens center. We assumed
the volume of the Luneburg lens is the same as the monocentric lens. There
is no volume of air for the Luneburg lens because the image surface is
coincident with the lens surface.

The trends indicate that systems with flat detector planes are least
efficient in terms of volume for a given space-bandwidth. An order of
magnitude increase in space-bandwidth requires a three orders of magnitude
increase in volume or, more simply, an order of magnitude increase in each
dimension of the box that defines the imaging system. Note that a
monocentric lens, with its spherical shape, requires more volume than a
plano-convex lens. This is reflected as a constant volume offset.

Systems with curved detector geometries use volume more efficiently. With a
detector plane attached to the surface of the lens, the Luneburg lens uses
volume most efficiently. The volume increases by only 1.5 orders of
magnitude for an order of magnitude change in space-bandwidth. The
monocentric lens has a slope of 1.6 and the plano-convex lens, 2.2. If one
discounts the Luneburg lens as an impossible ideal, one can conclude from
Fig. 15(a) that, given a fixed amount of volume, the monocentric lens with a
curved detector yields the highest space-bandwidth.
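The slopes quoted for Fig. 15(a) can be read as power-law exponents relating volume to space-bandwidth. The Python sketch below applies that reading; it assumes a pure power law $V \propto S^{\text{slope}}$ over the range of interest, which is a simplification of the plotted data, and the dictionary keys and function name are hypothetical labels.

```python
# Slopes (orders of magnitude of volume per order of magnitude of
# space-bandwidth) as quoted in the text for Fig. 15(a).
VOLUME_SLOPES = {
    "flat detector": 3.0,
    "plano-convex, curved detector": 2.2,
    "monocentric, curved detector": 1.6,
    "Luneburg": 1.5,
}


def volume_growth(space_bandwidth_ratio, slope):
    """Factor by which volume grows when S grows by `space_bandwidth_ratio`,
    assuming a pure power law V ~ S**slope."""
    if space_bandwidth_ratio <= 0:
        raise ValueError("space_bandwidth_ratio must be positive")
    return space_bandwidth_ratio ** slope


# Volume penalty for a tenfold increase in space-bandwidth.
for system, slope in VOLUME_SLOPES.items():
    print(f"{system:32s} x{volume_growth(10.0, slope):8.1f}")
```

Under this assumption, a tenfold increase in space-bandwidth costs a factor of 1000 in volume for a flat detector but only about 32 for the Luneburg lens.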

However, in addition to volume, scaling of lens mass $m_l$ (which is easily
converted to weight) is also a critical parameter of interest. This is
represented in Fig. 15(b). Again, the values shown are those that produce
maximum space-bandwidth for a minimum amount of mass. Not surprisingly, the
spherical shape of a monocentric lens is detrimental in terms of mass. A
monocentric lens imaging onto a flat detector geometry is least efficient in
terms of mass because a larger lens is required to achieve the same
space-bandwidth when compared to a curved detector geometry. The mass of a
plano-convex lens is reduced with increasing space-bandwidth because lens
f-number increases and, therefore, its thickness is reduced. [See Fig. 16
and the discussion contained in Appendix A.] Thus, an order of magnitude
increase in space-bandwidth reduces the mass by one-third of an order of
magnitude.

Fig. 15. (Color online) Physical characteristics as a function of
space-bandwidth for imaging systems analyzed. (a) System volume. (b) Lens
mass. (c) System density.

System density $\rho$ as a function of space-bandwidth is represented in
Fig. 15(c), where

$$\rho = \frac{m_l}{V_s}. \qquad (13)$$

We use this definition based on our assumption that the lens is the primary
contributor to system mass. Note that systems with monocentric lenses
exhibit a constant density, whereas the density of plano-convex systems
decreases with increasing space-bandwidth. This reduction in density is due
to the reduction in mass as mentioned previously. The constant density for
monocentric lens systems indicates lens size is the dominant characteristic.
An increase in space-bandwidth demands an increase in lens radius, which
dictates system volume and mass.

If one compares Figs. 15(a), 15(b), and 15(c), it is apparent that the
weight advantage offered by plano-convex lenses is negated by their large
volumetric costs. Conversely, a heavy system is the price one pays for the
optical performance provided by a monocentric lens.

In our final comment, we note that space-bandwidth is a measure of optical
resolution elements, or resels, which differs from the number of pixels in a
detector array. Depending upon the size of a single detector pixel, the
number of pixels per resel can vary. There should be at least one pixel per
resel but the number can be between four and eight for a well sampled
system. Given this caveat, Fig. 15 can provide some indication about the
system scale and weight required to achieve a given imaging capacity. For
example, Fig. 15 indicates that a 10-gigapixel (Gpx) imager should require
on the order of 1 m$^3$ of volume.

5.  Summary  and  Conclusion 

We examined the space-bandwidth of wide FOV imaging systems as the systems
scale in size. We extended Lohmann’s analysis for plano-convex lenses
imaging onto flat focal geometries to systems with monocentric lenses and
curved focal geometries, which have been proposed as technologies to provide
high resolution wide FOV imaging. To understand system cost, and not just
performance, we also assessed the volume and mass associated with these
systems.

Fig. 16. (Color online) Optical system scaling.

The most straightforward conclusion we draw, that a curved detector improves
the performance of an imaging system, provides justification for the DARPA
Hemispherical Array Detector for Imaging program [8]. Our analysis, however,
also supports development of monocentric lens systems. Our analysis
indicates that monocentric lenses imaging onto a curved detector outperform
other systems for the same design constraints but do so at a cost in lens
weight.

More important than simply providing these simple conclusions, our work also
provides a framework for analysis. Our particular instantiation of the
framework suggests the above conclusions but we recognize that further
investigation is required to validate their universality. For example, one
can use our analysis technique to consider the performance of multiscale
optical systems, such as those for chip-to-chip optical interconnects [14]
and those to reduce aberrations in wide FOV imaging [15].

We note that Fig. 15 considers only scaling the optics. An analysis useful
for a designer must consider scaling of the postdetection electronics as
well. In fact, a more exact analysis would consider the curved detector
approximated by a collection of two-dimensional flat detector arrays. Is
there an optimal array size under such conditions?

Finally, in this work we have considered only optical means to reduce the
impact of aberrations. We have not considered the performance of optics
combined with electronics. A recent analysis based on Lohmann’s heuristic
link between f-number and scale indicates some improvement is possible [10].
If the scale factor $M = f_\#^3$ as Lohmann prescribes, space-bandwidth
scales roughly as $M^{4/3}$. Using computation after detection, it is
possible to change this to $M^{2.8/1.9} \approx M^{3/2}$. This remains to be
verified. Even if true, it is imperative that a designer know the cost of
implementation for a given level of performance. That is, which
implementation is more costly, a monocentric lens with a curved detector or
a plano-convex lens with a curved detector followed by computation? We hope
to create an analysis framework that will allow us to address this point.

Appendix  A:  Lens  Analysis 

We describe in this appendix the procedures we used
to optimize and determine lens performance. The
procedures were embodied in several C++ programs
which we used to drive the optical system design and
analysis software package, ZEMAX, externally. We
used its raytracing and optimization capabilities to
collect a large set of data. Figure 16 indicates the relative
scale of the systems we considered. The width
of the illuminating beam is constant in each case. Because
of the wide range of lens parameters in which
we were interested for this study, ZEMAX had trouble
converging to physical solutions in all cases. Each
lens type and size, therefore, required some manual
adjustment to generate usable and repeatable data.
Nevertheless, the capability to automate the data
collection process proved invaluable in collecting
more than 13,000 spot-size data points for further analysis.

Figure 17 presents representative spot diagrams
at different angular positions for each of the imaging
systems at a single scale and f-number. The spot size
predicted by diffraction in each case is considerably
less than the scales shown.

Plano-Convex  Lenses 

To achieve a desired f-number for a plano-convex
lens with a fixed diameter, we adjusted the curvature
of the front surface, the thickness of the lens, and the
distance between the lens back surface and the on-axis
focal point. We located the flat detector plane
at the paraxial focus.

With the detector plane fixed, we varied the angle of
the incident beam between 0 and 60° in 2° increments
and, for each angle, adjusted the lens’ curvature and
thickness to maintain a constant image plane and
constant f/#. We determined the spot size assuming
a flat detector plane by launching a large number
of rays from a particular field point, including the
chief ray, into the entrance pupil of the lens, and tracing
them through the optics to the image surface. We
calculated the spot size as the root-mean-square of the
differential distance between the location of the chief
ray and the locations of the other rays. We also calculated
the diffraction spot size from the Airy disk.
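
The spot-size calculation described above can be sketched as follows. This is only an illustration, not the authors' C++/ZEMAX code: the function names, the array layout of the ray intercepts, and the use of the standard Airy-disk diameter 2.44 λ f/# (first dark ring) are assumptions introduced here.

```python
import numpy as np

def rms_spot_size(ray_xy, chief_xy):
    """RMS distance between ray intercepts and the chief-ray intercept.

    ray_xy   : (N, 2) array of ray positions on the image surface
               (hypothetical layout; the chief ray may be included).
    chief_xy : (2,) position of the chief ray on the same surface.
    """
    d = np.asarray(ray_xy, dtype=float) - np.asarray(chief_xy, dtype=float)
    return float(np.sqrt(np.mean(np.sum(d ** 2, axis=1))))

def airy_disk_diameter(wavelength, f_number):
    """Diameter of the Airy disk out to the first dark ring."""
    return 2.44 * wavelength * f_number

# Four rays placed one unit from the chief ray give an RMS spot size of 1.
rays = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
spot = rms_spot_size(rays, (0.0, 0.0))
airy = airy_disk_diameter(0.5, 2.0)  # e.g. 0.5 um light at f/2, result in um
```

Comparing `spot` with `airy` indicates whether a given field point is limited by geometrical aberrations or by diffraction.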

To determine S, we inserted these values into
Eq.  (2)  and,  in  accordance  with  Lohmann,  assumed 
the  value  of  resolution  spot  size  was  valid  across  the 
entire  detector  plane.  We  selected  the  maximum 
value  of  S  over  the  angles  calculated. 

Without changing the lens, we modified the curvature
of the image surface to measure the spot size on
a curved detector plane. We set the curvature of a
spherical surface such that the lens f/# was maintained
yet produced the smallest spot size for that
field angle (i.e., we determined the circle of least confusion).
Although the curvature is slightly different
for each field point, all surfaces intersect the optic
axis at the paraxial focus. The curved surface, in
effect, followed that of the Petzval surface.

Fig. 17. (Color online) Representative spot shapes from analysis.

Monocentric  Lenses 

The monocentric lens consisted of a central spherical
element and two outer spherical shells, all
concentric around the same point. We specified that
all elements were made of commercial glasses and
cemented to each other. Analysis of the lens was similar
to that for the plano-convex lens.

To determine the optimum position of the curved
image surface, we assumed it was spherical and concentric
with the lens, and we optimized the lens performance
(primarily f/#) only at two field points, 0°
and 60°.

We determined the geometrical spot size for incident
angles ranging from 0 to 80° in 2° increments.
It was necessary to increase the range of angles for
the monocentric lenses because they reached their
maximum space-bandwidth above 60°. Small lenses reached their
maximum space-bandwidth below 80°, but ZEMAX
did not produce stable results when we analyzed
large lenses above 80°. We therefore limited our
maximum angle for analysis to 80°.

Using ray tracing, we calculated first the geometrical
spot size on a curved image surface. We then calculated
the geometrical spot size on a flat image
plane by placing the image plane at the point where
the curved surface intersected the optic axis. To
achieve the required focal length for the spherical
elements, we changed the lens f-number by adjusting
the lens size while keeping its aperture constant.

Luneburg  Lenses 

We used the Luneburg lens as the limiting case on
performance for a similarly sized monocentric lens.
We scaled the Luneburg lens to match the f-number
and aperture of a monocentric lens yet simultaneously
provide aberration-free imaging. Because a
fully illuminated Luneburg lens is always f/0.5,
we had to stop down the lens to achieve f/1. To comply
with Lohmann’s approach, we changed the lens f-number
by changing its lens size while keeping its
aperture size constant.

Because ZEMAX was unable to trace accurately
the off-axis rays through this type of lens, we generated
most of our data for the Luneburg imaging
system by calculating the diffraction spot size numerically
and zeroing out the geometrical aberrations
generated by a monocentric lens. We used
ZEMAX to validate some of our results to ensure
the performance for rays at any field angle was
equivalent to that for an on-axis ray.

We note that we did not use the Luneburg lens in a
way that provided the best possible performance in
the smallest possible volume. To do so, we could have
modified the index of the Luneburg lens to reduce
volume, but we would have been forced to use a different
index profile for each f-number. This would have violated
our basic postulate not to expend considerable effort
to optimize lens performance.

Abstract: We report results of an ongoing study designed to assess the
ability for enhanced detection of recently buried land-mines and/or
improvised explosive devices (IEDs) using passive long-wave
infrared (LWIR) polarimetric imaging. Polarimetric results are presented for
a series of field tests conducted at various locations and soil types.
Well-calibrated Stokes images, S0, S1, S2, and the degree-of-linear-polarization
(DoLP)  are  recorded  for  different  line-of-sight  (LOS)  slant  paths  at  varying 
distances.  Results  span  a  three-year  time  period  in  which  three  different 
LWIR  polarimetric  camera  systems  are  used.  All  three  polarimetric  imaging 
platforms  used  a  spinning-achromatic-retarder  (SAR)  design  capable  of 
achieving  high  polarimetric  frame  rates  and  good  radiometric  throughput 
without  the  loss  of  spatial  resolution  inherent  in  other  optical  designs. 
Receiver-operating-characteristic  (ROC)  analysis  and  a  standardized 
contrast  parameter  are  used  to  compare  detectability  between  conventional 
LWIR  thermal  and  polarimetric  imagery.  Results  suggest  improved 
detectability, regardless of geographic location or soil type.

1. Introduction

Both  military  and  civilian  personnel  are  facing  an  ever-evolving  threat  from  buried/concealed 
landmines  and  improvised  explosive  devices  (IEDs).  There  has  been  significant  research 
dedicated  to  the  detection  of  buried  explosive  devices  [1-6].  One  particular  technology  that 
has  shown  promise  is  forward-looking  ground  penetrating  radar  (FLGPR)  [7-9].  However, 
recent  studies  have  identified  several  inherent  problems  associated  with  FLGPR,  e.g.,  buried 
explosives  that  are  formed  from  dielectric-  or  polymer-based  materials  (plastics)  are  difficult 
to  detect  due  to  the  small  electromagnetic  (EM)  radar  cross-sections  for  non-conducting 
materials  [10,11].  In  addition,  FLGPR  systems  are  plagued  by  unacceptable  false-alarm  rates 
due  to  the  detection  of  commonly  buried  debris.  A  consensus  has  emerged  that  two  or  more 
complementary technologies will most likely be required to improve detectability while
reducing false-alarm rates. One such complementary approach may involve a combination of a
FLGPR  system  with  an  optically  based  imaging  platform  capable  of  detecting  surface 
anomalies,  i.e.,  disturbed  earth  (DE),  that  result  when  explosive  devices  are  buried/concealed 
near  the  surface  of  a  given  terrain. 

One suggested imaging technique for the remote detection of DE involves various forms
of spectroscopic imaging in the thermal infrared (IR), sometimes termed multi- or hyperspectral
imaging [12-14]. These techniques attempt to exploit the so-called “reststrahlen”
effect, in which the bulk emissivity for a particular soil changes within the 8-10 µm spectral range
due to absorption at the reststrahlen frequencies, which are approximately equal to the natural
frequencies of certain crystalline structures associated with small semi-transparent silica-based
particles [15-17]. Although well-documented, the reststrahlen effect has been shown to be
quite variable depending on geographic location and soil composition. Most research shows a
3-5% variance in the IR emissivity associated with the reststrahlen phenomenon under optimal
conditions.

We  consider  a  new  imaging  approach  based  on  changes  in  polarization  state  associated 
with  radiation  that  is  emitted  and/or  reflected  from  a  surface  that  has  recently  been  altered 
[18-20]. The premise for considering polarimetric imaging is that both
manmade and naturally occurring terrain establish an “average” polarization profile or
pattern that results from vehicle traffic, weathering, or just the passage of time. Since the
polarization state of the image-forming radiation is extremely sensitive to subtle changes in
the geometry of the reflecting/emitting surface, differences in polarization signatures
arise for localized surface regions that have recently been disturbed.

For  our  application,  we  chose  to  use  a  Stokes  parameter  approach  to  describe  the 
polarization  state  of  the  radiation  that  is  emitted  and/or  reflected  from  a  target  area  [21].  We 
apply the Stokes methodology to an imaging application where we define the Stokes “images”
S1, S2, and S0 by the usual convention shown in Eqs. (1)-(3),

$$S_1 = I(0^\circ) - I(90^\circ) \quad [\mathrm{W/(sr \cdot m^2)}], \tag{1}$$

$$S_2 = I(+45^\circ) - I(-45^\circ) \quad [\mathrm{W/(sr \cdot m^2)}]. \tag{2}$$

For total linear polarization, the total radiance image, S0, is defined as

$$S_0 = I(0^\circ) + I(90^\circ) \quad \text{total radiance } [\mathrm{W/(sr \cdot m^2)}], \tag{3}$$


and the degree-of-linear-polarization (DoLP) image is expressed as

$$\mathrm{DoLP} = \frac{\sqrt{S_1^2 + S_2^2}}{S_0}, \tag{4}$$


where I(0°), I(90°), I(+45°), and I(-45°) represent spatially well-registered images produced
with polarimetrically filtered radiance [W/(sr·m²)] at orientation angles 0°, 90°, +45°, and -45°,
respectively, where 0° is defined as the vertical with respect to the image plane. As one can
see from Eqs. (1)-(3), the S1 image represents a relative measure of the vertical compared to
the horizontal component, the S2 image represents a relative measure of the difference
between the two ±45° diagonal states, and the S0 image is merely a conventional “intensity
only” image.
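
As an illustration of Eqs. (1)-(4), the following minimal sketch forms the Stokes and DoLP images from four registered, polarization-filtered radiance images. The function name, argument names, and the small `eps` guard against division by zero are assumptions made here; the SAR sensors used in this study derive their Stokes products from a retarder-modulated image sequence, which this sketch does not model.

```python
import numpy as np

def stokes_images(i_0, i_90, i_p45, i_m45, eps=1e-12):
    """Return (S0, S1, S2, DoLP) from radiance images at 0, 90, +45, -45 deg.

    0 deg is taken as vertical with respect to the image plane, so a
    negative S1 means the horizontal component dominates.
    """
    i_0, i_90, i_p45, i_m45 = (
        np.asarray(a, dtype=float) for a in (i_0, i_90, i_p45, i_m45)
    )
    s0 = i_0 + i_90          # total radiance
    s1 = i_0 - i_90          # vertical minus horizontal
    s2 = i_p45 - i_m45       # +45 deg minus -45 deg
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, eps)
    return s0, s1, s2, dolp

# Single-pixel example with arbitrary radiance values.
s0, s1, s2, dolp = stokes_images([3.0], [1.0], [2.5], [1.5])
```

Dividing `s1` and `s2` by `s0` gives normalized Stokes values in the range -1 to 1.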

This study represents a compilation of results spanning a three-year period. The three
field tests presented here were conducted in 2008, 2009, and 2011 to assess the viability of
using passive LWIR (8-12 µm) polarimetric imaging to identify regions of recent DE. The
primary goal was threefold: 1) objectively measure the ability to detect regions of disturbed
soil associated with the placement of buried land-mines and/or IEDs; 2) assess how
detectability is affected by varying soil type and composition; and 3) determine how LWIR
polarimetric DE signatures are affected by atmospheric conditions, e.g., clear sky, cloud
cover, rain, wind, etc.

During  the  time  period  that  this  report  encompasses,  three  different  LWIR  polarimetric 
imaging  platforms  were  used,  and  all  were  produced  by  Polaris  Sensor  Technologies,  Inc., 
located  in  Huntsville,  AL.  It  should  be  noted  that  the  use  of  three  different  LWIR  polarimetric 
imagers  was  dictated  not  by  choice,  but  rather  by  necessity.  After  an  initial  proof-of-concept 
study  in  2008,  a  first-generation  LWIR  polarimetric  imager  experienced  technical  issues  that 
delayed further study. Early in 2009, a LWIR microbolometer-based polarimetric sensor
became  available,  which  allowed  for  our  work  to  continue.  Finally,  in  2010,  a  state-of-the-art 
LWIR  polarimetric  imager  became  available  and  was  used  in  the  final  phase  of  the  study. 

Although there are a variety of optical configurations appropriate for polarimetric
imaging (e.g., division-of-amplitude (DoA), division-of-focal-plane (DoFP), and
division-of-aperture (DoAP)), we chose a division-of-time (DoT) approach based on a spinning
achromatic retarder (SAR) design for recording calibrated LWIR Stokes imagery [22-26].
Because the DoT method relies on the capture and differencing of sequentially recorded
images, it is only appropriate for imaging objects that are slowly moving or static within the
scene. Although somewhat limited by the sequential nature of the recorded imagery, it is by
far the best choice for basic research applications because it maximizes radiometric throughput,
spatial resolution, and polarimetric sensitivity.

2.  Spinning  achromatic  retarder  (SAR)  polarimetric  sensors 

A  spinning  achromatic  retarder  (SAR)  imaging  polarimeter  operates  by  capturing  a  sequence 
of  images  in  time.  Each  image  in  the  sequence  is  recorded  at  a  different  orientation  position  of 
a spinning achromatic retarder. In its principal mode of operation, the system acquires a set of
16 images per rotation of the retarder, i.e., images are captured at 0, 22.5, 45, ..., 337.5°.


Fig. 1. Basic design of a spinning achromatic retarder (SAR) LWIR polarimetric imager.

Figure 1 shows the basic design of a LWIR SAR-based imaging polarimeter in which
either a room-temperature microbolometer or a cryogenically cooled Mercury Cadmium
Telluride (MCT) focal-plane-array (FPA) detector is positioned at the image plane of the
sensor. In general, we have found the cooled MCT-based FPAs to exhibit a noise-equivalent
DoLP or NEDoLP (similar to NEΔT for conventional thermal) on the order of ±0.1%,
whereas the microbolometer-based systems typically exhibit NEDoLP values in the range of
±0.3-0.5%.

For MCT-based systems, much of the optical train is held under vacuum within a Dewar
and cooled to an approximate temperature of 88 K. The achromatic retarder is positioned just
outside the Dewar window and is mounted to a precision set of frictionless bearings. A series
of relay optics are used to reduce any beam wander generated by the rotating optic, and by
using this configuration, pixel registration error between sequential frames is typically less
than 1/20th of a pixel. The retarder is rotated continuously by a stepper motor at variable
rates, depending on the specified integration period and application. All three SAR-based
systems used in this study are designed to record/display a user-specified set of Stokes and/or
polarimetric image products at processing rates approaching real-time. An excellent review of
current polarimetric imaging technologies can be found in Tyo, Goldstein, Chenault, and
Shaw [27].

3. Detection analysis

Perhaps  one  of  the  more  difficult  tasks  involved  with  image  detection  analysis  is  developing 
an  objective  evaluation  metric  that  consistently  and  correctly  identifies  the  “best”  image  type 
for  maximum  detectability.  Much  of  the  difficulty  arises  from  the  fact  that  optimum 
detectability  is  so  heavily  dependent  on  the  type  of  end-user  one  considers,  e.g.,  human  or 
computer algorithm. To address this, we chose to use two established image evaluation
metrics: detection calculations based on a receiver-operating-characteristic (ROC)
curve approach and the more intuitive standardized contrast parameter method. Since both
techniques have inherent strengths and weaknesses, we also present to the reader actual LWIR
thermal and polarimetric image sets for subjective, yet sometimes more informative,
evaluation.

The receiver-operating-characteristic (ROC) analysis is often the tool of choice among
researchers within the artificial intelligence (AI) and automated-target-recognition (ATR)
community. The ROC method was originally developed for signal detection analysis but is
now widely applied in many different disciplines [28-30]. ROC curve analysis is used to
compare target detectability between different image sets recorded or processed by different
means. In order to do this, some a priori knowledge about the location of the actual target is
necessary so that a “truth” image can be generated. Figure 2(a) shows an image histogram of
an example truth image where the large Gaussian-like curve on the left represents the pixel
values associated with the background, and the smaller distribution on the right represents the
pixel values associated with the target. The vertical line in the figure represents an arbitrary
threshold point. Also shown are regions defined by the intersection of the two histograms, as
well as regions to the right and left of the threshold line identified as true-negative (TN),
true-positive (TP), false-positive (FP), and false-negative (FN) regions.

A  ROC  curve  is  generated  by  comparing  the  overlapping  regions  TN,  TP,  FP,  and  FN,  as 
the  threshold  point  is  swept  right  to  left  across  the  histogram.  Figure  2(b)  shows  a  resultant 
ROC  curve  for  the  histograms  shown  in  Fig.  2(a).  The  area  under  the  ROC  curve  is  defined  as 
the  normalized  probability  for  detection  of  the  target  identified  in  the  truth  image. 
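
A minimal sketch of the threshold sweep described above is given below. It is not the authors' implementation: the function names are hypothetical, and the sketch assumes a boolean truth mask and that target pixels lie on the high side of the threshold. Under that convention, a target that is darker than the background yields an area below 0.5.

```python
import numpy as np

def roc_curve(image, truth_mask):
    """Sweep a threshold from high to low and return (FPR, TPR) arrays."""
    scores = np.asarray(image, dtype=float).ravel()
    truth = np.asarray(truth_mask, dtype=bool).ravel()
    n_target = truth.sum()
    n_background = (~truth).sum()
    fpr, tpr = [0.0], [0.0]
    # Each distinct pixel value is used once as the threshold point.
    for t in np.sort(np.unique(scores))[::-1]:
        detected = scores >= t
        tpr.append((detected & truth).sum() / n_target)        # TP / (TP + FN)
        fpr.append((detected & ~truth).sum() / n_background)   # FP / (FP + TN)
    return np.array(fpr), np.array(tpr)

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy image: the last two pixels are the target and are clearly brighter.
image = np.array([0.0, 1.0, 2.0, 3.0, 10.0, 11.0])
truth = np.array([False, False, False, False, True, True])
auc = area_under_curve(*roc_curve(image, truth))
```

For the toy image the target and background histograms do not overlap, so the area under the curve is 1.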

One inherent weakness associated with the ROC curve approach stems from the fact that it
is a purely statistical method and fails to take into account additional spatial information
associated with localized target pixel location and/or clustering, an important aspect of visual
cognitive detection. The human eye can often decipher target regions within a scene based
on very subtle variations among clusters of pixels that form a particular distinguishable shape.
Nevertheless, the ROC method is a readily accepted metric among the AI/ATR community and
does offer an objective measure of target detectability.

Fig. 2. (a) An image histogram in which the target (right Gaussian) and background (left
Gaussian) regions are defined and (b) the corresponding ROC curve (horizontal axis: false
positive rate, 1 - specificity), where the area under the curve is related to the probability
of detecting the target within the scene.

A  second  evaluation  method  for  grading  imagery  for  maximum  detectability  involves 
calculating  a  “standardized”  contrast  parameter  [31].  At  the  most  fundamental  level,  the 
ability  to  detect  a  given  object  within  an  image  is  heavily  dependent  on  the  magnitude  of  the 
difference  between  pixel  values  associated  with  the  object  and  its  associated  background,  i.e., 
contrast.  However,  in  order  to  compare  pixel  values  that  result  from  different  image  types — 
e.g.,  thermal,  Stokes,  DoLP,  etc. — a  standardization  process  must  be  applied  to  the  entire 
image  set.  This  is  a  common  procedure  used  to  normalize  multivariate  image  sets  before 
applying  a  particular  evaluation  metric.  The  standardization  process  effectively  translates  each 
image  histogram  (derived  from  different  physical  quantities)  onto  the  same  basis  set  of 
coordinate axes. The standardization procedure involves subtracting the mean pixel value
derived from the entire image and dividing the resultant histogram by the standard deviation.
This multivariate normalization process in no way affects the overall integrity and information
content of the image. After image standardization, separate ROIs are defined for the target
and  background  regions  for  a  given  image  set,  and  the  average  pixel  value  for  each  region  is 
computed.  Finally,  a  standardized  contrast  parameter  is  calculated  for  each  image  and  is 
defined  as, 


$$C = \left| \mu_T - \mu_B \right|, \tag{5}$$

where $\mu_T$ and $\mu_B$ represent the average pixel values for the target and background ROIs,
respectively.
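
The standardization and contrast calculation of Eq. (5) can be sketched as below. The function name and the use of boolean ROI masks are assumptions for illustration; the ROI definitions used in the study are not reproduced here.

```python
import numpy as np

def standardized_contrast(image, target_mask, background_mask):
    """Standardize the whole image, then return |mean(target) - mean(background)|."""
    img = np.asarray(image, dtype=float)
    # Subtract the global mean and divide by the global standard deviation.
    z = (img - img.mean()) / img.std()
    target = z[np.asarray(target_mask, dtype=bool)]
    background = z[np.asarray(background_mask, dtype=bool)]
    return float(abs(target.mean() - background.mean()))

# Toy 2 x 2 image: top row is background, bottom row is target.
image = np.array([[0.0, 0.0], [2.0, 2.0]])
target = np.array([[False, False], [True, True]])
background = ~target
c = standardized_contrast(image, target, background)
```

Because of the standardization, rescaling or offsetting the pixel values (for example, changing radiance units) leaves the contrast parameter unchanged, which is what allows thermal, Stokes, and DoLP images to be compared on a common basis.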

4.  Experiment  (test  sites)  and  results 

The  first  test  was  conducted  on  May  22,  2008,  and  was  located  at  the  U.S.  Army  Research 
Laboratory  (ARL),  Adelphi,  MD,  on  a  test  surface  best  described  as  a  well-traveled  dirt  road 
consisting  of  a  gravel-clay-soil  mixture  that  was  well-compacted.  The  test  was  conducted  over 
a  6-h  period  during  mid-afternoon  under  clear  skies,  with  relative  humidity  approximately 
50% and temperatures varying from 77 to 81 °F. Holes were dug approximately 12 in. into the
hardened road surface, and surrogate IED targets were buried at different locations (see Fig. 3).

For this particular test, we used our lowest-resolution 256 x 256 MCT FPA-based SAR
polarimetric camera system. The polarimetric imager was mounted on a tripod and positioned
2.75 m above the ground and was focused on the DE region at an approximate distance of 10
m away, as shown in Fig. 4. A 50 mm LWIR objective lens was fitted to the polarimetric
sensor, which produced an effective field-of-view (FOV) of 15°. The camera LOS was angled
to the DE region, resulting in a range of grazing angles from 15 to 20° defined by the LOS
and the road surface. It should be noted that for this first proof-of-concept test, great care was
taken to camouflage the disturbed region as well as possible, i.e., so that it was not readily
noticeable to a casual observer; see Fig. 3(b). After the disturbed regions reached thermal
equilibrium, approximately 1 h later, a series of four image sets was recorded at 15-min intervals.
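
As a rough check on the viewing geometry, the grazing angle can be estimated from the camera height and the distance to the point of interest. The sketch below assumes flat ground and treats the quoted 10 m as a horizontal distance; both are simplifying assumptions made here, and the function name is hypothetical.

```python
import math

def grazing_angle_deg(camera_height_m, ground_distance_m):
    """Angle between the line of sight and a flat, level ground surface."""
    return math.degrees(math.atan2(camera_height_m, ground_distance_m))

# Camera 2.75 m above the ground, point of interest 10 m away horizontally.
angle = grazing_angle_deg(2.75, 10.0)
```

Under these assumptions the angle is about 15.4°, close to the lower end of the 15 to 20° range quoted above; points nearer the camera are viewed at steeper grazing angles.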


Fig. 3. Photographs of the DE region for the May 22, 2008, test conducted at the U.S. Army
Research Laboratory, Adelphi, MD, site. (a) DE test-bed (red oval area represents buried target)
and (b) a close-up of the DE region (clay-gravel-soil mixture).


Fig.  4.  Schematic  for  the  initial  test  conducted  on  May  22,  2008  showing  the  positioning  of  the 
LWIR  polarimetric  camera  with  respect  to  the  DE  test  region. 


Figure 5 shows the resultant imagery consisting of a conventional LWIR thermal image,
S0, the two Stokes images, S1 and S2, and the DoLP image, where a false color has been applied
to all the original grey-scale images. Note that all Stokes image values presented here are
normalized with respect to S0 and range from -1 to 1. Table 1 shows the average absolute
radiance and normalized Stokes values for ROIs that are defined as either the DE or
background regions.


Fig. 5. (a) Conventional LWIR thermal image, S0, for the DE region highlighted in Fig. 3,
recorded on May 22, 2008. (b)-(d) The resultant Stokes images S1 and S2 and the
degree-of-linear-polarization (DoLP) image, where the DE region is shown by an identifying
arrow.

As one can see in Fig. 5(a), the ability to distinguish disturbed from undisturbed soil
regions is quite poor for the conventional LWIR thermal image, S0, as reflected by the
lowest calculated ROC curve value, 0.256 (Fig. 6), and the lowest contrast parameter value,
0.056, shown in Table 2. The DE region becomes visible in the Stokes image S1 [Fig. 5(b)],
where contrast arises from the fact that the DE region emits thermal radiation that is slightly
less polarized than that of the surrounding undisturbed area (Table 1). Note that since
all normalized S1 values are negative, the majority of the polarization lies in the horizontal
plane, based on the definitions shown in Eqs. (1) and (2), which is associated with “emission”
dominant polarization. Conversely, a positive S1 value implies that the vertical component is
dominant, and the majority of the received radiance is due to “reflection” of the ambient
optical background.

Table 1. Average radiant and polarimetric values for DE and background ROI regions
for images shown in Figs. 5(a)-5(d), recorded on May 22, 2008.

May 22, 2008 Test            S0 (W/sr-m2)   S1/S0    S2/S0    DoLP (%)
DE region (ROI average)      27.36          0.041    -0.081   2.843
background (ROI average)     27.19          0.048    -0.058   3.044

Similar evaluation of the S2 image shows further improvement in target detectability, as
reflected by the highest calculated ROC curve and contrast parameter values of 0.958 and
1.745, respectively. Again, since the values for the normalized S2 image shown in Table 1 are
negative, the dominant polarization state is oriented at -45° with respect to the vertical. In a
scene in which the LOS is perfectly symmetric with respect to the lay of the surface (i.e., the
ground surface is flat and level, and the region of interest is centered), we would expect the
values of the S2 image to be nearly zero. However, due to the slope of the ground surface and
the fact that the camera mount was off-center with respect to the DE region, a larger than normal
difference arose between the +45° and -45° states. Figure 5(d) shows the DoLP image, which is
merely the normalized superposition of the Stokes images S1 and S2. A lower contrast parameter
value of 0.993 is not unexpected since the DoLP product contains noise components from S1,
S2, and S0, which in this case results in a slight reduction in the overall contrast for the DoLP
product image.


Fig. 6. Corresponding ROC curves calculated for images shown in Figs. 5(a)-5(d), in which
the probability of detection is defined as the integrated area under each respective curve.

Table 2. Comparison of the contrast parameter and ROC curve results for the image set
shown in Figs. 5(a)-5(d), recorded on May 22, 2008.

May 22, 2008 Test          S0      S1      S2      DoLP
contrast parameter         0.056   0.790   1.745   0.993
probability of detection   0.256   0.866   0.958   0.688

A follow-on series of DE tests was conducted over a multi-day period from August 27 to
September 4, 2009. The location was again the Adelphi, MD, area, which included the
original May 22, 2008, dirt road site as well as two new locations whose soil
compositions are characterized as a red-clay-silt mixture and a topsoil-type
material rich in organic material and small stones. As previously mentioned, the original
liquid nitrogen (LN2)-cooled 256 x 256 MCT FPA SAR polarimetric sensor was unavailable
during this period, and a new 324 x 256 FPA microbolometer-based SAR polarimetric imager
was substituted in its place.

The first test was conducted on August 27, 2009, at a local baseball field in Adelphi, MD.
The site was chosen because of its distinctive soil type, which is similar to that found in
various regions of Southeast Asia; see Fig. 7(a). Once again, holes were dug and surrogate
objects were buried at various locations at depths < 1 m. As in the first test conducted in
2008, each hole was carefully raked and brushed over to camouflage the fact that digging had
occurred; see Fig. 7(b). The microbolometer SAR polarimetric imager was set up in a similar
manner as in prior tests, with the exception that the camera was mounted slightly lower, at a
height of 2 m above the ground. This resulted in a viewing angle (as defined by the LOS and
the soil surface) that ranged from 10 to 16°, depending on the location of interest within the
scene. Once the disturbed regions reached thermal equilibrium with the surrounding background,
capture of polarimetric imagery began, and images were recorded every 15 min for approximately 4 h.
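
As a rough check on the geometry described above, the following sketch converts the sensor height and viewing angle into a horizontal standoff distance. It assumes flat ground and that the viewing angle is measured between the LOS and the soil surface; the function name `ground_range` is illustrative and does not come from the original study.

```python
import math

def ground_range(height_m, grazing_angle_deg):
    """Horizontal distance from the sensor to the point where the LOS meets flat ground."""
    return height_m / math.tan(math.radians(grazing_angle_deg))

# Sensor 2 m above the ground, viewing angles of 10 and 16 degrees (values from the text).
far_edge = ground_range(2.0, 10.0)    # shallower angle -> farther point
near_edge = ground_range(2.0, 16.0)   # steeper angle -> closer point
print(f"scene spans roughly {near_edge:.1f} m to {far_edge:.1f} m from the sensor")
```

Under these assumptions, the regions of interest would lie roughly 7 to 11 m from the sensor.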


Fig. 7. Test area recorded on August 27, 2009. (a) The packed red-clay-silt soil field used to
generate the DE regions, and (b) the subsequent camouflage and raking of the area.


Fig. 8. The polarimetric image set for the August 27, 2009, test, where the DE regions are shown by
the identifying arrows. (a) Conventional LWIR thermal image S0; (b) Stokes image S1; (c)
Stokes image S2; and (d) DoLP image, recorded with the microbolometer-based SAR
polarimetric sensor.

Figure 8 shows the resultant thermal and polarimetric image set recorded at the red-clay-silt
site on August 27, 2009. Two large DE regions are visible only in the Stokes and DoLP
images shown in Figs. 8(b)-8(d) and are identifiable as either light regions in image 8(b) or
dark regions in images 8(c) and 8(d). Average thermal and polarimetric values for the
disturbed and background ROIs are shown in Table 3.

Review of the values shown in Table 3 again shows negative values for the normalized S1
image, which implies that the polarization state results primarily from surface emission rather
than reflection of the ambient background radiation. Because the camera was well centered
with respect to the disturbed regions, greater symmetry was created within the scene, resulting
in normalized S2 values that were effectively zero. As with the earlier test, the disturbed
region was less polarized than the undisturbed background, with DoLP values of
1.00% and 1.43%, respectively.
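
To make the relationship between the tabulated quantities concrete, the sketch below computes the DoLP from normalized Stokes parameters using the standard definition, DoLP = sqrt(S1^2 + S2^2)/S0. The function name `dolp_percent` is illustrative, and applying the formula to the ROI-averaged values of Table 3 is an assumption made here for demonstration only.

```python
import math

def dolp_percent(s1_norm, s2_norm):
    """Degree of linear polarization (%) from normalized Stokes parameters S1/S0 and S2/S0."""
    return 100.0 * math.hypot(s1_norm, s2_norm)

# ROI-averaged normalized Stokes values from Table 3 (Aug. 27, 2009).
background = dolp_percent(-0.014, 0.004)
de_region = dolp_percent(-0.001, -0.002)
print(f"background: {background:.2f}%  DE region: {de_region:.2f}%")
```

The background value (about 1.46%) is close to the tabulated 1.4%, whereas the DE value (about 0.22%) falls well below the tabulated 1.0%. One possible explanation, which the source does not confirm, is that the tabulated DoLP was averaged pixel by pixel: because the DoLP is a magnitude, its ROI average can exceed the DoLP computed from ROI-averaged S1 and S2 values.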

The results of the ROC curve and contrast parameter analysis are shown in Table 4 and
indicate that the DoLP image is ranked highest in detectability, followed closely by the S1
image. However, issues associated with the ROC curve approach become apparent when
considering the S2 image, which registered the lowest probability of detection, with a value of
0.312, although the DE regions are clearly visible in the S2 image shown in Fig. 8(c).
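
The following sketch illustrates one way in which a threshold-based detection probability and a contrast parameter can disagree. Both definitions are assumptions made for illustration: the contrast parameter is taken as |mean_t - mean_b| / sqrt(var_t + var_b), and the detector uses a one-sided threshold set from the background at a fixed false-alarm fraction. The pixel values are hypothetical, and the source does not state that its metrics were computed this way.

```python
from statistics import mean, pvariance

def contrast_parameter(target, background):
    """Assumed form: |mean_t - mean_b| / sqrt(var_t + var_b); blind to the sign of the contrast."""
    spread = (pvariance(target) + pvariance(background)) ** 0.5
    return abs(mean(target) - mean(background)) / spread

def detection_probability(target, background, pfa=0.25):
    """Fraction of target pixels above a threshold exceeded by at most a fraction pfa (< 1) of background pixels."""
    ranked = sorted(background, reverse=True)
    threshold = ranked[int(pfa * len(ranked))]
    return sum(v > threshold for v in target) / len(target)

background = [1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]   # hypothetical pixel values
bright_target = [2.0, 2.1, 2.2, 2.3]                     # brighter than the background
dark_target = [0.8, 0.9, 1.0, 1.1]                       # equally far below the background

for name, roi in (("bright", bright_target), ("dark", dark_target)):
    print(name, round(contrast_parameter(roi, background), 3),
          detection_probability(roi, background))
```

In this toy case both targets have the same contrast parameter, yet the one-sided detector finds every bright-target pixel and no dark-target pixel. A similar effect may contribute to low detection probabilities for regions that appear dark, as the DE regions do in Fig. 8(c), although this remains a conjecture.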


Table 3. Average radiant and polarimetric values for the DE and background regions
shown in Figs. 8(a)-8(d) for the red-clay-silt test site recorded on August 27, 2009.

Aug. 27, 2009 Test          S0 (W/sr-m^2)   S1/S0    S2/S0    DoLP (%)
DE region (ROI average)     16.41           -0.001   -0.002   1.0
background (ROI average)    15.46           -0.014    0.004   1.4

Table 4. Comparison of the contrast parameter and ROC curve results for the red-clay-soil
surfaces shown in Figs. 8(a)-8(d), recorded on August 27, 2009.

Aug. 27, 2009 Test          S0      S1      S2      DoLP
contrast parameter          0.023   1.438   1.022   1.549
probability of detection    0.444   0.478   0.312   0.795

On September 3, 2009, we returned to the first test-site location of May 22, 2008, and
repeated the measurement with the microbolometer-based SAR polarimetric sensor. This
time, two surrogate targets were buried, and after thermal equilibrium between the DE and
background surfaces was reached, polarimetric image sets were recorded; see Fig. 9. Note
that the bright spots seen in the S0 image, Fig. 9(a), were a result of localized heating due to
direct sunlight that was filtered through a series of trees located on the right side of the test
site.

Fig. 9. Radiant and polarimetric imagery recorded on Sept. 3, 2009, at the original May 22,
2008, site. (a) Conventional thermal LWIR image S0; (b) Stokes image S1; (c) Stokes
image S2; and (d) DoLP image, recorded using a microbolometer-based SAR imaging
polarimeter.

On  September  4,  2009,  we  chose  a  test  location  containing  a  topsoil  mixture  consisting  of 
fine  dirt  and  gravel  that  was  “shadowed”  from  the  afternoon  sun  by  a  large  building,  see  Fig. 
10(a).  This  location  was  specifically  chosen  to  assess  how  shadowing,  as  well  as  radiant 
loading  resulting  from  the  building,  would  affect  the  ability  to  polarimetrically  resolve  regions 
of  DE.  Only  one  surrogate  target  was  buried  for  the  test,  and  after  a  period  of  about  2  h 
(needed  for  thermal  equilibrium  to  occur),  recording  of  polarimetric  imagery  was  started,  see 
Fig.  10. 


Fig. 10. Radiant and polarimetric imagery recorded on Sept. 4, 2009, showing the effect of
shadowing and ambient radiant loading on a DE region. The target site consisted of a topsoil
mixture of fine dirt and gravel. (a) Conventional thermal LWIR image S0; (b) Stokes image
S1; (c) Stokes image S2; and (d) DoLP image, recorded using the microbolometer-based SAR
imaging polarimeter.

Table 5. Average radiant and polarimetric values for DE and background ROIs recorded
on September 3-4, 2009, for the image sets shown in Figs. 9 and 10 using the LWIR
microbolometer SAR polarimetric sensor.

Sept. 3, 2009 Test          S0 (W/sr-m^2)   S1/S0    S2/S0    DoLP (%)
DE region (ROI average)     13.13           -0.010   -0.003   1.1
background (ROI average)    13.13           -0.016   -0.006   1.7

Sept. 4, 2009 Test          S0 (W/sr-m^2)   S1/S0    S2/S0    DoLP (%)
DE region (ROI average)     13.61           -0.005    0.001   0.6
background (ROI average)    13.26           -0.009   -0.006   1.1

The resultant radiance and polarimetric values for the DE and background ROIs, as well
as the corresponding detection metrics for the September 3-4, 2009, tests, are shown in Tables
5 and 6. Review of the contrast parameter and ROC curve results shows that the degree of
detectability for the polarimetrically derived imagery is consistently greater than that of the
conventional thermal image, S0. However, there is disagreement on whether the S1 or DoLP image is
actually superior, since the September 3 contrast parameter implies the DoLP image to be
best, while the ROC curve results for both September 3 and 4 show the S1 to be the superior
detection image. Again, the dilemma of what is the “best” image is left to the observer, and
after comparing the image sets shown in Figs. 9 and 10, one could make a good argument for
either the S1 or DoLP image being the superior detection image.

Until now, all of the studies have been short in duration (i.e., a single day), and the
obvious question is, “How long and under what conditions will such disturbances continue to
be detectable using a polarimetric imaging method?”

To address this issue partially, we conducted a multi-day field test that occurred over a
five-day period spanning October 1-5, 2011. The term “partially” is used because even after a
week in the field, in which we experienced a variety of weathering conditions including a
series of modest rain events and monsoon-type winds, many of the DE regions were still
visible to the polarimetric sensor.

Table 6. Comparison of the contrast parameter and ROC curve results for the DE tests
recorded on September 3-4, 2009, for the image sets shown in Figs. 9 and 10 using the
LWIR microbolometer SAR polarimetric sensor.

Sept. 3, 2009 Test          S0       S1      S2      DoLP
contrast parameter          0.082    0.775   0.744   0.913
probability of detection    0.394    0.754   0.522   0.248

Sept. 4, 2009 Test          S0       S1      S2      DoLP
contrast parameter          0.0210   1.569   0.594   1.473
probability of detection    0.399    0.613   0.678   0.397

This final test was conducted in the arid southwestern United States at
the Energetic Materials Research and Testing Center (EMRTC) in Socorro, NM, where the
soil type is characterized as loam, i.e., a soil composed of sand, silt, and clay at about
40-40-20% concentrations, respectively. For this particular study, our most sensitive LWIR
MCT-based polarimetric sensor became available. The 640 LWIR SAR Polarimetric imager,
produced by Polaris Sensor Technologies, housed a Stirling-cooled 640x480 MCT FPA
detector, which was provided by DRS Technologies, Inc. The system was designed for
maximum radiometric throughput and sensitivity, which result from efficient sensor design and
an unusually wide spectral response of 7.5 to 11.1 μm for the FPA.

The final test was held in conjunction with another experiment in which a vehicle-mounted,
forward-looking ground-penetrating radar (FLGPR) was being evaluated. In order
to get quasi-registered LWIR polarimetric imagery with the FLGPR system, the polarimetric
sensor was mounted on an elevated platform located above the FLGPR transmitter/receiver.
This resulted in the sensor being 4.5 m above the ground, as shown in Fig. 11(a). The
polarimetric sensor-FLGPR platform was tilted downward at an angle of 24° (defined by
the LOS and the test surface). A variety of surrogate objects were buried and camouflaged by hand
in one of three test lanes; see Figs. 11(b) and 11(c). A series of five DE regions was imaged
every other day during the period of October 1-5, 2011, in which meteorological conditions
varied greatly. Weather conditions for each of the three days were as follows: October 1,
clear sky and low relative humidity, with a temperature range of 69-78°F; October 3, overcast
with low relative humidity and slightly cooler temperatures in the range of 65-73°F; and October
5, ground surface damp as a result of a rain storm that occurred during the prior day, and
much cooler, with a temperature range of 50-60°F.

The effects of weathering are displayed in a series of images shown in Figs. 12(a)-12(c).
Each column displays a visible image, a conventional LWIR thermal image, S0, the Stokes
image, S1, and a DoLP image for DE target 2 (TG 2). Tabulated radiant and polarimetric
values for all DE targets (TG) 1-5, recorded from October 1-5, 2011, as well as their
corresponding background (BG) regions, are shown in Table 7. A similar list of tabulated
ROC curve and contrast parameter values for the same period is shown in Table 8.


Fig. 11. (a) The FLGPR vehicle platform on which the 640x480 SAR polarimetric sensor was
mounted and positioned at an angle of 24 degrees with respect to the LOS and the target
surface. (b) A typical surrogate IED being buried in the arid desert soil. (c) A typical DE
region after burial.


[Fig. 12 image grid: rows show the Visible, LWIR (S0), S1, and DoLP images; columns are
(a) Oct. 1, 2011, (b) Oct. 3, 2011, and (c) Oct. 5, 2011.]

Fig. 12. Typical evolution of visible, thermal, and polarimetric signatures for a DE region over
the five-day period from Oct. 1-5, 2011, where the DE region is identified by the
arrow. (a) Target 2 recorded on Oct. 1, 2011, under a clear sky condition. (b) Target 2 recorded
on Oct. 3, 2011, during dense overcast conditions. (c) Target 2 recorded on Oct. 5, 2011, under
light cloud cover and after a moderate rain event that occurred on Oct. 4, 2011.

The overall trend seen in the 2008 and 2009 studies continues and is reflected in the
values shown in Table 7. Specifically, 1) the polarization state for the S1 images continues to
result from emission-dominant radiance; 2) the DoLP for DE regions is always less than that of the
undisturbed background surfaces; and 3) the highest ranked image type for detectability
continued to be the S1 image, followed closely by the DoLP images. The values for the
normalized S1 images recorded on October 3 (dense cloud cover) were, on average, slightly
lower in magnitude than S1 values recorded on either October 1 or October 5.


Table 7. Radiance and polarimetric values for DE targets (TG) 1-5 and corresponding
background (BG) regions.

10/1/11          TG 1     BG 1     TG 2     BG 2     TG 3     BG 3     TG 4     BG 4     TG 5     BG 5
S0 (W/cm^2-sr)   35.84    35.29    36.64    35.75    35.64    34.63    37.66    37.19    36.86    36.34
S1/S0            -0.007   -0.013   -0.008   -0.012   -0.007   -0.017   -0.008   -0.012   -0.007   -0.011
S2/S0            0.000    0.000    0.000    0.000    0.000    0.001    0.000    0.000    0.000    0.000
DoLP (%)         0.67     1.35     0.81     1.15     0.71     1.77     0.84     1.25     0.71     1.11

10/3/11          TG 1     BG 1     TG 2     BG 2     TG 3     BG 3     TG 4     BG 4     TG 5     BG 5
S0 (W/cm^2-sr)   22.62    22.16    23.73    23.56    22.78    22.22    23.53    23.01    22.76    22.28
S1/S0            -0.005   -0.008   -0.005   -0.009   -0.006   -0.010   -0.006   -0.009   -0.005   -0.008
S2/S0            0.000    0.000    0.000    0.001    0.000    0.000    0.000    0.000    0.000    0.000
DoLP (%)         0.56     0.83     0.55     0.95     0.57     1.07     0.57     0.86     0.52     0.83

10/5/11          TG 1     BG 1     TG 2     BG 2     TG 3     BG 3     TG 4     BG 4     TG 5     BG 5
S0 (W/cm^2-sr)   23.42    23.34    22.69    23.72    21.56    21.96    21.86    22.34    22.27    22.94
S1/S0            -0.014   -0.018   -0.013   -0.018   -0.014   -0.023   -0.015   -0.018   -0.014   -0.021
S2/S0            0.001    0.001    0.001    0.002    0.002    0.003    0.002    0.002    0.001    0.002
DoLP (%)         1.45     1.86     1.38     1.87     1.47     2.38     1.52     1.87     1.44     2.16

However, based on our prior experience, we would have expected even lower S1 values
for the October 3 test data, considering the overcast conditions, which are often associated with
large ambient radiant loading. This is because the linear polarization that occurs when
ambient radiance is reflected from a surface is always orthogonal to the linear polarization
that results purely from emission, and the net effect is an overall reduction in the total
linear polarization exhibited by the surface. This effect is most often seen in MidIR
polarimetry, where it is not uncommon to see the sign of various regions within an S1 image
flip with quickly changing meteorological conditions [32].


Table 8. ROC curve and contrast parameter (CP) values for DE regions 1-5 recorded on
Oct. 1-5, 2011.

10/1/2011   1 ROC   1 CP    2 ROC   2 CP    3 ROC   3 CP    4 ROC   4 CP    5 ROC   5 CP
S0          0.633   0.843   0.53    0.179   0.587   1.301   0.655   0.831   0.646   0.211
S1          0.828   0.889   0.746   0.641   0.638   1.067   0.865   1.250   0.874   0.881
S2          0.494   0.045   0.48    0.119   0.476   0.248   0.399   0.307   0.431   0.232
DoLP        0.201   0.867   0.296   0.621   0.418   1.080   0.167   1.227   0.114   0.876

10/3/2011   1 ROC   1 CP    2 ROC   2 CP    3 ROC   3 CP    4 ROC   4 CP    5 ROC   5 CP
S0          0.718   0.904   0.801   1.302   0.742   0.799   0.718   0.707   0.493   0.195
S1          0.808   0.952   0.853   1.682   0.73    1.543   0.808   1.610   0.791   1.065
S2          0.495   0.013   0.439   0.472   0.551   0.050   0.495   0.209   0.543   0.102
DoLP        0.278   0.786   0.198   1.655   0.419   1.501   0.278   1.559   0.582   1.045

10/5/2011   1 ROC   1 CP    2 ROC   2 CP    3 ROC   3 CP    4 ROC   4 CP    5 ROC   5 CP
S0          0.541   0.459   0.452   0.304   0.504   0.556   0.541   0.713   0.662   0.435
S1          0.896   1.255   0.754   0.583   0.693   0.880   0.896   2.518   0.851   1.162
S2          0.489   0.311   0.53    0.066   0.519   0.045   0.489   0.427   0.432   0.331
DoLP        0.162   1.236   0.303   0.576   0.408   0.844   0.162   2.450   0.515   1.166

Perhaps the most interesting aspect of the study emerged after the rain event that occurred
on October 4. Although not readily apparent in the image set shown in Fig. 12(c), the modest
rainfall of October 4 appeared to actually improve the contrast between the DE and
surrounding undisturbed areas. We had always expected that after a sufficient amount of
weathering and/or traffic, polarimetric detection of regions of
recently disturbed soil would no longer be possible. However, based on these preliminary results,
modest weathering events like blowing wind and rain may actually serve to enhance the
effect, at least initially. In this particular case, the rain events appear to have produced a net
“smoothing” of the undisturbed regions, while the DE regions were far less affected. This is
also reflected by the relatively large DoLP values recorded on October 5 for both the target
(TG) and background (BG) surfaces shown in Table 7.

5.  Conclusion 

We have shown that by using passive LWIR polarimetric imaging, one can improve the
ability to remotely detect localized regions of recent DE that are often associated with the
burying of landmines and IEDs. The results stem from a series of tests conducted at a variety
of geographic locations with varying soil types, in which three different LWIR
polarimetric imaging platforms were used. In light of the multitude of changing parameters
and sensors, the final results were surprisingly consistent.

First, based on the objective detection metrics used in this study (i.e., ROC curve analysis
and standardized contrast parameter calculations), the ability to detect localized regions of
DE was greatest for the polarimetric images S1 and DoLP. We believe the DE contrast seen in
the Stokes images S1 is a direct result of the symmetric slant-path imaging arrangement used
in the study. We expect that for an elevated nadir-type detection arrangement, the DoLP image
would be the image type of choice for detecting regions of DE.

Second, all measured S1 signatures were due to emission-induced polarization, i.e.,
linearly polarized in the horizontal plane. As mentioned earlier, in situations in which the
ambient optical background is changing, as is the case when stratus or nimbostratus cloud
cover is present, the sign of specific regions in the Stokes images S1 or S2 is often observed
to change, signifying that the mechanism for generating linear polarization has switched from
being emission dominant to reflection dominant, or vice versa. However, observations also show that
even during these events, the polarimetric contrast between a given target and the
corresponding background is preserved [32].

Finally, the polarimetric contrast necessary to distinguish DE regions from the undisturbed
surrounding areas results from the fact that undisturbed surfaces tend to exhibit higher degrees
of linear polarization compared to DE areas. Put more generally, polarimetric contrast
between disturbed and undisturbed surface regions arises when the symmetry of the surface is
altered. Such symmetry may result from naturally occurring events, e.g., prolonged wind and
rain storms, or from a manmade process associated with vehicular and pedestrian travel. Once a
soil surface is altered, very subtle, yet quite measurable, differences in the polarization state of
the reflected or emitted radiation occur at the boundary that separates disturbed from
undisturbed surface areas.

We have shown for the cases considered here a net reduction in the linear polarization for
the DE regions (relative to the surrounding undisturbed regions) on the order of 20-100% or
more. Although all of the imagery recorded involved a slant-path LOS, the authors believe the
ability to detect regions of DE using passive LWIR polarimetric imaging would greatly
benefit if conducted from an aerial platform. This would allow for imaging of much larger
surface areas in which a “change-detection” method could be applied. We are currently
planning a series of such studies using a nadir LOS where DE regions are polarimetrically
imaged in both the MidIR and LWIR to assess the benefit of using a dual-band approach.
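
As an illustration of the size of the DoLP reduction quoted above, the sketch below computes the target-to-background DoLP difference for the five October 1, 2011, regions listed in Table 7. The source does not state which value serves as the reference for the percentage, so both conventions are shown; the helper name `percent_change` is an illustrative assumption.

```python
# DoLP (%) for targets (TG) and backgrounds (BG) 1-5 on Oct. 1, 2011, from Table 7.
tg = [0.67, 0.81, 0.71, 0.84, 0.71]
bg = [1.35, 1.15, 1.77, 1.25, 1.11]

def percent_change(reference, other):
    """Absolute difference between the two values as a percentage of the reference."""
    return 100.0 * abs(other - reference) / reference

rel_to_bg = [percent_change(b, t) for t, b in zip(tg, bg)]   # reduction relative to the background
rel_to_tg = [percent_change(t, b) for t, b in zip(tg, bg)]   # difference relative to the DE value

print("relative to BG:", [round(x) for x in rel_to_bg])
print("relative to TG:", [round(x) for x in rel_to_tg])
```

For these five regions the difference is roughly 30-60% when referenced to the background and roughly 40-150% when referenced to the DE value.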


Design  of  220  GHz  Electronically  Scanned 
Reflectarrays  for  Confocal  Imaging  Systems 

Abigail  S.  Hedden,  Charles  R.  Dietlein  and  David  A.  Wikner 


Optical Engineering 2012, 51 (9), 10. [DOI: 10.1117/1.OE.51.9.091611]




Design  of  220  GHz  electronically  scanned  reflectarrays  for 
confocal  imaging  systems 


Abigail  S.  Hedden 
Charles  R.  Dietlein 
David  A.  Wikner 

U.S.  Army  Research  Laboratory 
2800  Powder  Mill  Road 
Adelphi,  Maryland  20783 
E-mail:  abigail.s.hedden.civ@mail.mil 


Abstract. The authors analyze properties of a 220 GHz imaging system
that uses a scanned reflectarray to perform electronic beam scanning of a
confocal imager for applications including imaging meter-sized fields of
view at 50 m standoff. Designs incorporating reflectarrays with confocal
imagers have not been examined previously at these frequencies. We
examine tradeoffs between array size, overall system size, and number
of achievable image pixels, resulting in a realistic architecture capable
of meeting the needs of our application. Impacts on imaging performance
are assessed through encircled energy calculations, beam pointing accuracy,
and the number and intensity of quantization lobes that
appear over the scan ranges of interest. Over the desired scan range,
arrays with 1- and 2-bit phase quantization showed similar array main
beam energy efficiencies. Two-bit phase quantization is advantageous
in terms of pointing angle error, resulting in errors of at most 15% of
the diffraction-limited beam size. However, both phase quantization
cases considered resulted in spurious returns over the scan range of
interest, and other array layouts should be examined to eliminate potential
imaging artifacts. © 2012 Society of Photo-Optical Instrumentation Engineers (SPIE).
[DOI: 10.1117/1.OE.51.9.091611]

1 Introduction

Detection of concealed explosives and other devices easily
masked by clothing is a significant application of passive
and active millimeter-wave/terahertz imaging systems.1
Developing methods for detection at long ranges is crucial
for effectively dealing with this threat. For applications
involving mobile targets, imaging large fields of view at
near video frame rates is needed. A common approach
implemented in state-of-the-art millimeter-wave and terahertz
systems uses mechanical scanners for rapid beam steering.
A drawback of mechanical scanning is the associated
hardware,2 which can be physically large, heavy, and
cause vibrations. Many radar applications, unlike passive
millimeter-wave imaging, do not need the large bandwidth
provided by scanning mirrors. In this context, rapidly rotating
and oscillating elements are often a hindrance. Developing
electronic scanning capabilities in the millimeter-wave
and terahertz regimes avoids these downsides, yielding
robust, portable, and lightweight systems.

Phased array technology and electronic beam scanning
techniques are well developed at frequencies below 100 GHz.
Extending this technology to the upper millimeter-wave/
terahertz regime is an active area of research.3-5 Phase shifters
have already been demonstrated at frequencies above
200 GHz6 and low-loss transmission line structures and terahertz
integrated circuits have been demonstrated at frequencies
beyond 600 GHz.7 Although this technology is in the
early stages of development, it is paving the way for future
high frequency phased arrays.

To the authors’ knowledge, designs incorporating reflectarrays
with confocal reflector systems have not been explored at
frequencies as high as 220 GHz. In this work, we analyze an
architecture that uses a reflection-type phased array, or scanning
reflectarray, to perform electronic beam scanning of a
confocal imager to image humans at standoff distances of
50 m. This architecture is particularly useful for applications
that simultaneously require a high degree of lateral resolution
and modest bandwidth (several percent), such as radar imaging.
We concentrate, in particular, on characteristics of the
reflectarray that impact design, for example, its linear dimensions,
the number of elements, element spacing, and the number
of phase quantization levels. Section 2 describes the
system architecture and the constraints on the imager that
result from our desired application. Results of a geometrical
optics analysis are presented, and the implications for electronic
scanning are considered. In Sec. 3, uniform lattice arrays
compatible with the imager architecture are used to analyze
properties including beam symmetry and encircled energy
as a function of scan angle and array element spacing.
Phase quantization is considered for two-dimensional (2-D)
rectangular lattice arrays, and impacts on image quality, including
beam pointing error and the relative number and intensity
of quantization lobes, are analyzed for arrays with 1- and 2-bit
phase shifters. A brief study of reflectarray element design and
implementation is also presented.

2  Confocal  Imager 

2.1  Basic  Constraints 

For  imaging  humans  in  meter-sized  fields  of  view  at  standoff 
distances  of  approximately  50  m,  centimeter-scale  lateral 
resolution is desirable. However, apertures larger than 1 m
are  required  to  achieve  diffraction-limited  spot  sizes  of 
5  cm  or  less  at  frequencies  below  about  400  GHz.  Since 
many  standoff  imaging  applications  are  not  well  suited  to 
bulky,  heavy  optics,  this  problem  can  be  solved  by  operating 
at  higher  frequencies  (>400  GHz)  with  smaller  apertures. 
Although  component  technology  is  not  well  developed  at 
these  frequencies,  recent  and  ongoing  programs5  make  the 
development  of  electronic  scanning  capabilities  above 
400  GHz  an  attractive  consideration  for  future  efforts. 

In this work, we investigate the impact of electronic scanning
on system design by considering a system that operates
in the 220 GHz atmospheric window. We selected this
frequency as a compromise between aperture diameter,
diffraction-limited spot size, component availability, signal
attenuation, and the transmission and reflectivity characteristics
of targets. Compared with higher frequency operation,
220 GHz offers increased transmission through clothing1,8
and more mature component technology.

For our system, we chose a confocal design for its ability
to scan the output beam of a large aperture using a relatively
small scanning element. Confocal reflector systems have
been demonstrated with mechanically scanned terahertz imagers9,10
for concealed weapons detection and were proposed
for use with phased arrays for satellite-based applications.11
Additionally, they can be implemented without obstructing
the primary aperture, accommodate a fully illuminated aperture
for high resolution, and be scanned up to several degrees
off axis without significant vignetting of the scanned beam
by the primary. Hereafter, we refer to this vignetting as spillover.


2.2  System  Architecture 

Figure 1 is a schematic representation of our design along
with an equivalent simple lens model of the reflector system.
A paraboloidal feed reflector illuminates the scanning reflectarray
with a plane wave. The diameter of the active area of
the array is dA. The reflectarray scans the beam ±θ2 over an
under-illuminated secondary reflector (element 2). The
primary and secondary reflectors, with focal lengths and
apertures of f1, D1 and f2, D2, respectively, share a common
focus and are designed so that the primary D1 is fully filled
by the array beam with negligible spillover as the beam is
scanned off axis by ±θ1.

Fig. 1 (a) Diagram of the layout of a scanned reflectarray incorporated
into a confocal reflector system. The array scans the beam
over an under-filled secondary reflector, thereby steering the beam
at the primary. The primary element is fully filled over a scan
range of ±θ1. (b) An equivalent lens model of the system showing
the array and two focusing elements.

We use geometrical optics to obtain rough system dimensions
and properties of the elements shown in Fig. 1. As we
show, the design is sensitive to dA. Although increasing dA
narrows the width of the scanned beam in object space, it
also increases the length of the optical system. Our goal
is to design a system that meets our scanning criteria but
is moderately sized.

To balance the physical size of the system and its spatial
resolution, we set D1 = 1 m. At 220 GHz, this yields a
diffraction-limited angular spot size of 1.66 mrad, which in turn
yields a spot size of 8.3 cm at 50 m. Because we fix the
aperture diameter of the primary to 1 m, we can control system
size by controlling its total length,

$$l_{\mathrm{tot}} \approx s + f_2 + f_1. \qquad (1)$$


Although  the  optical  design  could  be  made  more  compact, 
the  total  focal  length  provides  a  sense  of  its  overall  size. 
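The spot-size figures quoted above for the 1 m primary can be checked with a few lines of Python. This is a minimal sketch that assumes the Rayleigh criterion, 1.22 λ/D, is what underlies the quoted 1.66 mrad; the function name `diffraction_spot` is hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def diffraction_spot(freq_hz, aperture_m, range_m):
    """Return (angular spot size in rad, linear spot size in m at range_m).

    Assumes the Rayleigh criterion 1.22 * lambda / D for the angular size.
    """
    wavelength = C / freq_hz
    theta = 1.22 * wavelength / aperture_m
    return theta, theta * range_m


theta, spot = diffraction_spot(220e9, 1.0, 50.0)
print(f"{theta * 1e3:.2f} mrad, {spot * 100:.1f} cm")
```

Under that assumption the result agrees with the 1.66 mrad and 8.3 cm values given in the text.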

In order to fill the primary aperture without significant spillover as the array scans the beam off axis, element 2 maps the array aperture distribution onto the element 1 aperture. From the thin lens formula, the distance s is related to the focal lengths of the elements according to


$$\frac{1}{f_2} = \frac{1}{s} + \frac{1}{f_1 + f_2}. \tag{2}$$


Furthermore, the system magnification is given by

$$M = -\frac{f_1 + f_2}{s}. \tag{3}$$


Note  that  the  image  of  the  reflectarray  is  inverted  in  the 
aperture  of  the  primary,  which  indicates  M  is  negative  in 
our system. Equation (3) shows that if f2 and s are fixed, then f1 grows with magnification. Therefore, from Eq. (1)
it  is  obvious  that  the  entire  system  grows  as  magnification 
increases. 
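The growth of the system with magnification can be made concrete by solving the thin-lens relation and the magnification definition together. The sketch below assumes the forms 1/f2 = 1/s + 1/(f1 + f2) and M = -(f1 + f2)/s for Eqs. (2) and (3); the helper name `confocal_layout` is hypothetical and lengths are in cm.

```python
def confocal_layout(f2_cm, mag_abs):
    """Solve Eqs. (2) and (3) for s and f1, given f2 and |M| (with M = -|M|).

    Eq. (3) gives f1 + f2 = |M| * s. Substituting into Eq. (2):
        1/f2 = 1/s + 1/(|M| * s)  ->  s = f2 * (1 + |M|) / |M|
    and therefore f1 = |M| * s - f2 = f2 * |M|.
    """
    s = f2_cm * (1.0 + mag_abs) / mag_abs
    f1 = f2_cm * mag_abs
    f_tot = s + f2_cm + f1  # Eq. (1)
    return s, f1, f_tot


s, f1, f_tot = confocal_layout(15.0, 13.0)
print(f"s = {s:.1f} cm, f1 = {f1:.0f} cm, f_tot = {f_tot:.0f} cm")
```

For f2 = 15 cm and |M| = 13 this gives values close to the rounded entries of Table 1, and it shows directly that f1, and hence the total length, scales linearly with |M| when f2 is held fixed.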

From  an  alternative  definition  of  system  magnification, 


$$|M| = \frac{D_1}{d_A} = \frac{\tan\theta_2}{\tan\theta_1}. \tag{4}$$


We note that, for fixed D1, |M| decreases as the reflectarray increases in size. If we assume the spacing between array elements is fixed at λ/2, increasing dA increases the number of elements that contribute to a scanned beam, which increases the beam efficiency.4 This tradeoff is reflected in Fig. 2, which represents the relationship between reflectarray size and both system magnification and the number of reflectarray elements. Magnification is representative of system length, whereas number of elements is an indication of beam quality. In the remainder of this work, we consider reflectarray diameters compatible with common wafer sizes from 2 inch to 6 inch (~5 cm to 16 cm).
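The tradeoff between magnification and element count can be sketched numerically. The example below assumes |M| = D1/dA and, as a labeled assumption, that the array is a square lattice inscribed in a wafer of diameter dA; the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def magnification(d1_m, da_m):
    """|M| = D1 / dA for a fully filled primary."""
    return d1_m / da_m


def elements_inscribed_square(da_m, freq_hz, spacing_wavelengths=0.5):
    """Elements per side and total count for a square lattice inscribed
    in a circle of diameter da_m (assumption: side = dA / sqrt(2))."""
    wavelength = C / freq_hz
    side = da_m / math.sqrt(2.0)
    per_side = int(side // (spacing_wavelengths * wavelength))
    return per_side, per_side ** 2


for da_cm in (5.0, 7.5, 15.0):
    per_side, total = elements_inscribed_square(da_cm / 100.0, 220e9)
    print(da_cm, round(magnification(1.0, da_cm / 100.0), 1), per_side, total)
```

Under these assumptions a 7.5 cm array at 220 GHz has a little under 6000 elements, and doubling the diameter roughly quadruples the count while halving |M|.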

In addition to illuminating the primary aperture fully, we must ensure that element 2 is large enough to accept the beam from the reflectarray as it scans over a range ±θ2. Thus,

$$D_2 > d_A + 2s\tan\theta_2. \tag{5}$$


Fig.  2  System  magnification  and  reflectarray  elements  as  a  function 
of  reflectarray  size  expressed  in  inches  and  in  cm.  Magnification  is 
representative  of  system  length,  whereas  number  of  elements  is 
an  indication  of  beam  quality.  This  highlights  the  tradeoff  between 
overall  system  size  and  array  complexity. 


In order to ensure small system size, we assume element 2 is an f/1 optic, D2 = f2, which constrains dA,

$$d_A < f_2\,[1 - 2(1 - M)\tan\theta_1]. \tag{6}$$

One can rewrite Eq. (6) using Eqs. (2) through (4) to indicate explicitly the limitations the design imposes on the scanner's field of view (FOV), defined by scan angle θ1,

$$[\,\cdots\,] > \tan\theta_1. \tag{7}$$
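One way to make the field-of-view limit explicit, offered as a sketch rather than as the authors' exact form of Eq. (7), is to solve Eq. (6) for tan θ1 and substitute the magnification. The substitution M = -D1/dA is an assumption that combines the sign convention noted below Eq. (3) with the aperture ratio.

```latex
\begin{align*}
d_A &< f_2\left[1 - 2(1 - M)\tan\theta_1\right]
  && \text{Eq. (6)} \\
\tan\theta_1 &< \frac{1 - d_A/f_2}{2(1 - M)}
  && \text{solve for } \tan\theta_1,\ \text{valid since } 1 - M > 0 \\
\tan\theta_1 &< \frac{d_A\,(f_2 - d_A)}{2 f_2\,(D_1 + d_A)}
  && \text{using } M = -D_1/d_A
\end{align*}
```

For dA = 7.5 cm, f2 = 15 cm, and D1 = 100 cm the right-hand side is about 0.0174, which corresponds to a scan angle of about 1 deg.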


Figure 3 illustrates the impact of this limitation on system operation assuming D1 = 1 m and a 50 m object distance. If the total FOV is approximately 2θ1, the largest target the system can scan at 50 m is 100 × θ1 m for θ1 expressed in radians. We plot target extent in Fig. 3 as a function of array diameter with system f-number, fsys, as a parameter, where

$$f_{\mathrm{sys}} = f_{\mathrm{tot}}/D_1. \tag{8}$$

The upper left portion of the plot corresponds to an unphysical system space where the desired target region exceeds the maximum achievable scan angle. The right, vertical axis refers to the number of pixels in an image for a given target extent, if a pixel is defined at every half beamwidth. Note that, at small array diameters, system size (indicated by fsys) increases rapidly with field of view. For a fixed system size, mapping large fields of view requires large array diameters. It is worth noting that tripling the array diameter yields only a modest increase in FOV, roughly double, whereas, from Fig. 2, the number of elements, and with it the complexity of the array, grows roughly as the square of the array diameter.

Fig. 3 Plot of field of regard (expressed as both target size and number of half-beamwidth image pixels) as a function of reflectarray size for various system f-numbers, (f1 + f2 + s)/D1, calculated for a range of 50 m and D1 = 1 m.

Because each sample within the FOV requires a minimum integration time to ensure low noise detection, our desire to generate images at 30 Hz frame rates also imposes limits on the FOV. Figure 3, therefore, also indicates the number of samples the system generates within the target region assuming an 8.3 cm beamwidth and half beamwidth sampling. If we assume a 30 Hz frame rate, a 1 m diffraction-limited primary aperture, and a 2 m target region, the maximum integration time per sample is 15 μs. The position update rates of phase shifters used to scan the array limit the field of view and, given that 10 μs switch times have been demonstrated at microwave frequencies with variable microelectromechanical system (MEMS) capacitors,12 an FOV corresponding to a 3 m target extent is the limit at which images can be captured at 30 Hz frame rates.

Based on our analysis, we adopt a 3-inch-diameter reflectarray design. Table 1 provides an overview of system details assuming dA = 7.5 cm (3 inches), D1 = 1 m, and a maximum desired steering angle of θ1 = ±1 deg. This yields M = 13, fsys = 2.3, and an array size of approximately N = 6000 elements (assuming half wavelength spacing). From Table 1, the overall size of the 220 GHz system is at the edge of what may be considered portable. However, a benefit of the design is its potential to be scaled for higher frequency operation. For example, a system operating at 410 GHz with D1 = 0.53 m achieves a similar diffraction-limited spot size to the one shown in Table 1. If the array diameter and desired total scan angle remain the same (dA = 7.5 cm and θ1 = ±1 deg), then M = 6.6, ftot = 90 cm, and ftot/D1 = 1.8 for the 410 GHz system.
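The frame-rate limit on integration time can be estimated with a short calculation. This sketch assumes a square target region sampled on a square grid at half-beamwidth pitch, which the text does not state explicitly; the function name is hypothetical.

```python
def integration_time_per_sample(target_m, beamwidth_m=0.083, frame_rate_hz=30.0):
    """Upper bound on dwell time per sample, in seconds.

    Assumes a square target of side target_m sampled every half beamwidth,
    with the whole frame time shared equally among the samples.
    """
    pitch = beamwidth_m / 2.0
    samples_per_side = target_m / pitch
    n_samples = samples_per_side ** 2
    return 1.0 / (frame_rate_hz * n_samples)


t = integration_time_per_sample(2.0)
print(f"{t * 1e6:.1f} microseconds per sample")
```

For a 2 m target this gives a little over 14 μs, in line with the roughly 15 μs figure quoted for a 30 Hz frame rate.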


3  2-D  Array  Architecture 

3.1 Uniform Arrays: Size and Element Spacing Effects

In this section we consider the properties of the far-field pattern generated by uniform rectangular lattice arrays whose size and element spacing are compatible with the confocal reflector system described in Sec. 2.2. To do so, we generate the complex wave-amplitude of the far-field pattern AF(θ, φ), referred to as the array factor, for different scan angles and array layouts. For an N = A × B element array with dx and dy element spacing along orthogonal array axes, the array factor is given by:13

$$\mathrm{AF}(\theta,\phi) = \sum_{a=1}^{A} \exp\{j[(a-1)k d_x \sin\theta\cos\phi - \alpha_{x,a}]\} \cdot \sum_{b=1}^{B} \exp\{j[(b-1)k d_y \sin\theta\sin\phi - \alpha_{y,b}]\},$$

$$\alpha_{x,a} = (a-1)k d_x \sin\theta_0\cos\phi_0 \quad\text{and}\quad \alpha_{y,b} = (b-1)k d_y \sin\theta_0\sin\phi_0, \tag{9}$$

where αx,a and αy,b are the phase values for each element given a desired steering angle (θ0, φ0) in the plane of the reflectarray, and k = 2π/λ0 is the free-space wavenumber.

Table 1 Details of a confocal reflector system incorporating a ~3-inch-diameter reflection-type phased array including approximate sizes and focal lengths of optical elements, system magnification, and desired scan angles.

dA (cm)   s (cm)   θ2 (deg)   D2 (cm)   f2 (cm)   f1 (cm)   D1 (cm)   M (unitless)   θ1 (deg)   Target (m)
7.5       16       13         15        15        200       100       13             1          1.7
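The array factor of Eq. (9) is separable in the two lattice axes, so it can be evaluated as the product of two one-dimensional sums. The following is a minimal sketch under the assumption that lengths are expressed in wavelengths (so k = 2π); the function name `array_factor` is hypothetical.

```python
import cmath
import math


def array_factor(theta, phi, theta0, phi0, A, B, dx, dy):
    """Evaluate Eq. (9) for an A x B array steered to (theta0, phi0).

    Angles are in radians; dx and dy are element spacings in wavelengths.
    The index a in range(A) plays the role of (a - 1) for a = 1..A.
    """
    k = 2.0 * math.pi  # wavenumber when lengths are in wavelengths
    ux = math.sin(theta) * math.cos(phi)
    uy = math.sin(theta) * math.sin(phi)
    ux0 = math.sin(theta0) * math.cos(phi0)
    uy0 = math.sin(theta0) * math.sin(phi0)
    # (a-1)*k*dx*ux - alpha_x,a collapses to (a-1)*k*dx*(ux - ux0)
    sum_x = sum(cmath.exp(1j * a * k * dx * (ux - ux0)) for a in range(A))
    sum_y = sum(cmath.exp(1j * b * k * dy * (uy - uy0)) for b in range(B))
    return sum_x * sum_y


t0, p0 = math.radians(7.0), math.radians(5.0)
peak = abs(array_factor(t0, p0, t0, p0, 78, 78, 0.5, 0.5))
off = abs(array_factor(t0 + 0.1, p0, t0, p0, 78, 78, 0.5, 0.5))
print(peak, off)
```

At the steering direction every term has zero phase, so the magnitude equals the number of elements, A × B; away from it the terms dephase and the magnitude drops.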

In our analysis, we consider arrays with uniform spacing λ0/2 or larger. Antenna structures and the physical size of high frequency phase shifters3,6,14 make half-wavelength spacing difficult to achieve. We assume the arrays consist of identical elements with equal amplitudes and, initially, continuous phase. We consider the effects of phase quantization in Sec. 3.2.

Table 2 shows properties of arrays with different element spacings, scan angles, and sizes compatible with fabrication on 2-inch, 3-inch, and 6-inch wafers. All arrays shown can be used with the confocal reflector system in Sec. 2.2. The designs in rows 5 to 7 of Table 2 are consistent with the system described in Table 1. Pattern properties, including full-width half-max (FWHM) values of array main beams, were determined from two-dimensional Gaussian functions fit to the far-field power pattern |AF(θ, φ)|². We found the best fit by iteratively minimizing the sum of the squares of the errors between the generated pattern and the Gaussian functions. The pattern properties extracted from Gaussian fits to the main beam yield far-field characteristics that match those predicted by analytical formulae. For example, consider the data presented in the first two rows of Table 2. As expected, for a square array of equally spaced elements, the fitted orthogonal beamwidths for the broadside array (row 1) are the same. Also, as the beam is steered off broadside to 15 deg, the fitted width in the dimension parallel to φ = φ0 = 0 deg grows as sec θ0, while the width in the perpendicular dimension is unaffected. Comparing the far-field beamwidths from the numerical calculation with the results from analytic expressions that are valid for large uniform rectangular arrays whose maxima are not steered far from broadside13 reveals <3% difference between the FWHM values.

For uniform arrays with element spacings greater than λ0/2, the main beam FWHM characteristics are largely unaffected by element spacing for a given scan angle since each array configuration occupies the same physical area. Encircled energy calculations are sensitive to array element spacing and provide a quantitative way of assessing the impact of grating lobes and scan angle error on beam characteristics.

The encircled energy was calculated by dividing the total energy contained within the FWHM due to the array by the total energy expected for a solid reflector of the same size, Ea/Es. The FWHM of the beam pattern due to the solid reflector is calculated from a normalized Gaussian function fit to the Airy pattern:

$$F_{\mathrm{theory}}(x,y) = \exp\left\{-\left[\frac{(x - x_0)^2}{2\sigma_x^2} + \frac{(y - y_0)^2}{2\sigma_y^2}\right]\right\},$$

$$\sigma_x = 1.028\,\lambda_0/(d_A\cos\theta_0) \quad\text{and}\quad \sigma_y = 1.028\,\lambda_0/d_A, \tag{10}$$

where the pattern widths are given by 1.028 λ0/D and D = dA cos θ0 is the projected array diameter. The results are shown in Table 2 and Fig. 4, where encircled energy is plotted as a function of angular radius for the array layouts shown in Table 2 rows 5 to 7 for 15 deg scan angles. From Fig. 4 it is clear that arrays with smaller element spacing rapidly approach 90% encircled energy as angle increases. However, for the array with λ0 element spacing, only about 40% of the total energy is contained within the FWHM of Ftheory due to grating lobes. This is also reflected in the results shown in Table 2 for all layouts with λ0 element spacing; about 50% or less energy is contained within a small angular radius of the main beam.

Table 2 The properties of different uniform arrays are shown for rectangular lattice arrays with various layouts. Properties were calculated by fitting Gaussian functions to the array patterns. E is the percentage of energy contained within an angular range equal to the FWHM of Ftheory.

dA (cm)   size (M × N)   spacing (λ0)   θ0 (deg)   FWHM (deg)      E (%)
3.6       51 × 51        0.5            0          [1.96, 1.96]    97
3.6       51 × 51        0.5            15         [2.00, 1.94]    90
3.6       36 × 36        0.7            15         [2.03, 1.96]    88
3.6       25 × 25        1.0            15         [2.05, 1.98]    46
5.4       78 × 78        0.5            15         [1.31, 1.27]    90
5.4       55 × 55        0.7            15         [1.33, 1.28]    89
5.4       39 × 39        1.0            15         [1.31, 1.27]    43
10.8      155 × 155      0.5            15         [0.66, 0.64]    91
10.8      110 × 110      0.7            15         [0.66, 0.64]    91
10.8      77 × 77        1.0            15         [0.66, 0.64]    52

Since grating lobes can contain a large fraction of the total power, this limits the spacing of array elements to <λ0. The results show that 0.7λ0 spacing is quite comparable to 0.5λ0 and a good option. Having a significant fraction of power located in grating lobes is detrimental to the system efficiency since the gain of the main beam and the energy coupled into the optical system are reduced. Even more problematic is the increased level of image clutter that can arise if unwanted lobes propagate through the optical system. The reflectivity characteristics of terrestrial scenes at millimeter-wave frequencies can vary by greater than 20 dB, and false returns may be confused with those of the main beam. This problem, including the number and intensity of such lobes, is explored further for our optical system in the next section.
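The sharp drop in encircled energy at λ0 spacing follows from the standard grating-lobe condition, sin θ = sin θ0 + m λ0/d for nonzero integer m. The sketch below checks which lobes fall in visible space for the spacings in Table 2; it treats a single principal plane only, which is a simplifying assumption, and the function name is hypothetical.

```python
import math


def grating_lobe_angles(spacing_wavelengths, scan_deg, max_order=5):
    """Angles (deg) of grating lobes in visible space for a line array.

    Uses sin(theta) = sin(theta0) + m / (d / lambda0), m = +-1, +-2, ...
    Lobes with |sin(theta)| > 1 are evanescent and are skipped.
    """
    u0 = math.sin(math.radians(scan_deg))
    angles = []
    for m in range(-max_order, max_order + 1):
        if m == 0:
            continue  # m = 0 is the main beam itself
        u = u0 + m / spacing_wavelengths
        if abs(u) <= 1.0:
            angles.append(math.degrees(math.asin(u)))
    return angles


for d in (0.5, 0.7, 1.0):
    print(d, grating_lobe_angles(d, 15.0))
```

For a 15 deg scan, spacings of 0.5λ0 and 0.7λ0 produce no visible grating lobe, while λ0 spacing places one near -48 deg, consistent with the loss of main-beam energy seen only in the λ0 rows.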


3.2  Element  Phase  Quantization 

3.2.1  Continuous  phase  shift 

The previous section assumes a continuous phase shift is applied across the array to achieve beam scanning. The amount of phase shift between elements needed to steer the main beam of a uniform array over a given angular range is determined by the array geometry and the operating wavelength. The progressive phase shift required to achieve a particular scan angle is:

$$\Delta\alpha = \frac{2\pi d}{\lambda_0}\cos(90^\circ - \theta_2), \tag{11}$$

where d is the element separation and λ0 is the operating wavelength. For large arrays with uniform spacing of order λ0/2 or more, each element must be capable of 360 deg of phase shift. The maximum required progressive phase shift between elements to cover the full field of view is about 40 deg. Scanning the beam over this field of view in half-beamwidth increments requires a minimal progressive phase shift of about 2 deg. This implies that an eight-bit phase shifter is required.
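The 40 deg, 2 deg, and eight-bit figures can be reproduced from the progressive-phase relation. This sketch assumes Eq. (11) has the form (2πd/λ0) cos(90° - θ2), half-wavelength spacing, a ±13 deg scan at the secondary (Table 1), and the 0.65 deg scan increment used in Sec. 3.2.3; the function name is hypothetical.

```python
import math


def progressive_phase_deg(spacing_wavelengths, scan_deg):
    """Progressive inter-element phase shift in degrees:
    (2*pi*d/lambda0) * cos(90 deg - theta), with d in wavelengths."""
    return 360.0 * spacing_wavelengths * math.cos(math.radians(90.0 - scan_deg))


max_shift = progressive_phase_deg(0.5, 13.0)   # edge of the scan range
min_shift = progressive_phase_deg(0.5, 0.65)   # one scan increment
# Number of bits needed to resolve min_shift over a full 360 deg range
bits = math.ceil(math.log2(360.0 / min_shift))
print(f"{max_shift:.1f} deg, {min_shift:.2f} deg, {bits} bits")
```

Resolving a step of about 2 deg over 360 deg needs roughly 180 states, which rounds up to eight bits.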

Phase shifters with a 360 deg tuning range and an accuracy and reproducibility of 2 deg have not been demonstrated at 220 GHz. Also, developing digital phase shifters with seven to eight bits is prohibitive at these frequencies due to overall unit cell size and the complexity of bias routing, among other considerations. At frequencies around 200 GHz, several phase shifter designs have been successfully demonstrated. For example, phase shifters using GaAs Schottky varactor diodes integrated with 90 deg hybrid microstrip circuits exhibited 180 deg of phase shift with errors of ±15 deg over a 195 to 250 GHz frequency range.15 Also, distributed MEMS transmission lines have been demonstrated at G-band (140 to 220 GHz), including 4- and 15-element designs based on switched MEMS capacitors,6 and single-bit switched-line phase shifters have been proposed for 600 GHz operation.3 Achieving beam scanning via continuous progressive phase shift is not practical given the stringent requirements on total tunable range and accuracy mentioned above. For this reason, the rest of this section explores properties of arrays with quantized phase shift capabilities.

Fig. 4 (a) Encircled energy calculations for similar array layouts containing different element spacings are shown for a desired scan angle of 15 deg. Two best case curves are presented as Gaussian fits to both the Airy pattern of a solid reflector (Gaussian Theory) and the array factor power pattern (Gaussian Fit). (b) A zoomed version of (a) showing the same encircled energy curves along with the results of fitted and theoretical Gaussian curves. For all layouts except the one with λ0 element spacing, the encircled energy rapidly approaches >90% levels within small angular radii.

Fig. 5 Phase needed to achieve a desired scan angle of (θ0, φ0) = (7 deg, 5 deg) is shown as a function of element number across a 78 × 78 element half-wavelength spaced array for different phase quantization cases up to three bits.


3.2.2 2-D rectangular lattice array with quantized phase shift

Using Eq. (9), phase quantization is applied by replacing the ideal phases with quantized phase states, αq x,a and αq y,b, that are multiples of 2π/2^b mod 2π, where b is the number of bits. As an example, Fig. 5 shows the phase shifts needed for the 78 × 78 element half-wavelength spaced array in Table 2 (see row 5) to achieve an arbitrary scan angle of (θ0, φ0) = (7 deg, 5 deg). Phase shift is shown as a function of array element number for continuous ideal phase shift and for one to three bit phase quantization. The resulting array factors are shown in Fig. 6 for arrays with one to three bit phase shifters. The array factors are normalized to the peak intensity values and are plotted in direction cosines. The grayscale range has been restricted to bring out lower level structure in the images, highlighting features at the -10 to -20 dB level.
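The quantization step described above can be sketched as follows. The text does not say whether phases are rounded or truncated to the allowed states, so rounding to the nearest state is an assumption here, and `quantize_phase` is a hypothetical name.

```python
import math


def quantize_phase(alpha, bits):
    """Map a phase (rad) to the nearest multiple of 2*pi / 2**bits, mod 2*pi.

    Rounding to the nearest state is an assumption, not taken from the text.
    """
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    return (round(alpha / step) % levels) * step


# Ideal x-axis phases for one row of a 78-element, half-wavelength array
# steered to (theta0, phi0) = (7 deg, 5 deg), as in the Fig. 5 example.
kd = 2.0 * math.pi * 0.5
ux0 = math.sin(math.radians(7.0)) * math.cos(math.radians(5.0))
ideal = [a * kd * ux0 for a in range(78)]

one_bit = {quantize_phase(p, 1) for p in ideal}
two_bit = {quantize_phase(p, 2) for p in ideal}
print(sorted(one_bit), len(two_bit))
```

With one bit every element is driven to either 0 or π, which is why the resulting phase profile is a square wave and the far field acquires strong quantization lobes.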

The most important feature in Fig. 6 is the appearance of quantization lobes, which occur in antenna arrays or diffractive optical elements with finite phase quantization.16-18 The quantization lobes are most significant, in terms of quantity and magnitude, for the array with one-bit phase quantization. For the one-bit case, three quantization lobes with equal amplitude to the main beam appear, symmetric about broadside. Quantization lobes also appear in array factors with two-bit and three-bit phase shifters, but their significance decreases as the number of quantization levels increases. These lobes are weaker for a reflectarray illuminated by a spherical wave front compared with a focusing feed reflector.

Quantization lobes affect system performance in several ways, resulting in scan angle errors and decreased main beam directivity. Additionally, quantization lobes are clearly harmful if they are in close angular proximity to the main beam and can therefore propagate through the optical system. Their effects on the imaging performance of the system presented in Sec. 2.2 will be discussed in terms of beam pointing accuracy, the number and intensity of lobes appearing over the entire desired scan region, and their impact on the fraction of total energy contained within the main beam as a function of scan angle.

The remainder of this section compares the relative imaging performance of the system described in Sec. 2.2 with a half-wavelength spaced, 3-inch diameter array with one- and two-bit phase quantization. The decision to focus on one- and two-bit phase quantization is based on a desire to minimize fabrication complexity of the reflectarray. This is particularly important for large half-wavelength spaced arrays where minimization of unit cell area is highly beneficial. For example, the 5.4 cm 78 × 78 element array shown in Table 2 requires more than 6000 elements for a one-bit phase shifter design. Since even the simplest unit cell designs, such as switched line phase shifters, can require many switches to realize 2^b phase states where b is the number of bits, the feasibility of bias routing quickly becomes a significant concern along with overall array complexity. The physical size of phase shifting elements is an additional hurdle that makes it difficult to implement multi-bit phase shifters while maintaining half-wavelength spacing. For this reason, we choose to examine tradeoffs between systems with one- and two-bit phase quantization.

Fig. 6 Plot of the array factors for arrays with one- to three-bit phase quantization shown in Fig. 5. The array factors have been normalized to the peak intensity of the main lobes and are shown plotted in direction cosines.


3.2.3  Beam  pointing  accuracy 

The introduction of phase quantization leads to beam pointing errors resulting from array phase error. This error is most severe for scan angles near broadside where the change between phase states is large and for angles with an integer number of elements per phase step. Studies of one-dimensional (1-D) arrays reveal scan errors ranging from a few tenths to a few thousandths of a degree depending on array size, number of elements, and number of bits.19 We analyze the effects of beam pointing error for the system described in Sec. 2.2 with a half-wavelength spaced 78 × 78 element array (Table 2, row 5) and examine the magnitude of this error with one- and two-bit phase quantization over the desired scan range of the optical system.

To evaluate beam pointing accuracy, array factors were calculated for arrays with one- and two-bit phase quantization using Eq. (9) over all scan angles (θ0, φ0) covering azimuth and elevation ranges compatible with the optical design shown in Table 1. Az/El scan angles of ±13 deg (for a total of 26 deg) are needed at the secondary to map the desired field of view. We assume a minimum scan angle increment of 0.65 deg on the secondary. The 26 deg scan region in Az/El was divided into individual desired pointing angles (Az0, El0) based on the minimum scan angle increment, and array factors were generated for each location. For all scan angles, array factors were generated over a finely meshed grid with angular spacing much smaller than the minimum scan angle in regions surrounding the expected location of the main lobe. The beam pointing error (Δp) was determined by the magnitude of the difference between the desired scan angle (Az0, El0) and the location of the array factor maximum peak (Azpeak, Elpeak):

$$\Delta_p = \sqrt{(\mathrm{Az}_0 - \mathrm{Az}_{\mathrm{peak}})^2 + (\mathrm{El}_0 - \mathrm{El}_{\mathrm{peak}})^2}.$$
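The peak-search procedure described above can be illustrated in one dimension. The sketch below is not the authors' 2-D calculation: it assumes a 78-element line array with half-wavelength spacing, nearest-state phase rounding, and a search window of ±1 deg on a 0.005 deg grid, all of which are illustrative choices with hypothetical function names.

```python
import cmath
import math


def quantize(phase, bits):
    """Round a phase (rad) to the nearest of 2**bits states; None = continuous."""
    if bits is None:
        return phase
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    return (round(phase / step) % levels) * step


def af_power(theta_deg, theta0_deg, n=78, spacing=0.5, bits=None):
    """|AF|^2 of an n-element line array (spacing in wavelengths)."""
    kd = 2.0 * math.pi * spacing
    u = math.sin(math.radians(theta_deg))
    u0 = math.sin(math.radians(theta0_deg))
    total = sum(cmath.exp(1j * (a * kd * u - quantize(a * kd * u0, bits)))
                for a in range(n))
    return abs(total) ** 2


def peak_near(theta0_deg, bits, half_window=1.0, step=0.005):
    """Angle of the strongest response within +-half_window of theta0."""
    steps = int(round(half_window / step))
    grid = [theta0_deg + i * step for i in range(-steps, steps + 1)]
    return max(grid, key=lambda t: af_power(t, theta0_deg, bits=bits))


def pointing_error(az0, el0, az_peak, el_peak):
    """Magnitude of the pointing error between desired and achieved angles."""
    return math.hypot(az0 - az_peak, el0 - el_peak)


for bits in (None, 1, 2):
    err = pointing_error(7.0, 0.0, peak_near(7.0, bits), 0.0)
    print(bits, round(err, 3))
```

With continuous phase the peak lands exactly on the commanded angle; with quantized phase the located peak can shift by a fraction of a degree, which is the quantity mapped over the Az/El scan region in the analysis.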

The results are shown in Fig. 7 with the magnitude of beam pointing error plotted as a grayscale image over the total Az/El region for arrays that differ only in their phase quantization. The range of beam pointing error shown in the figure has been clipped at 0.3 deg to highlight lower-level detail; for comparison purposes both panels of Fig. 7 have been plotted with this intensity scale. Several features are apparent in the plots. The relatively large errors observed around the origin and in stripes near the plot axes and at ±15 deg elevation are due to large changes in phase states. The differences between achieved and desired scan angles are particularly large for angles that are nearly on-axis for the one-bit phase shifter case, where the phase states abruptly change by π. This error also appears to a lesser degree in the two-bit phase shifter plot. The average and median scan angle errors over the entire regions are shown in Fig. 7: (0.07, 0.05 deg) for the one-bit case and (0.04, 0.03 deg) for the two-bit case. However, as highlighted in the plot, there are regions where the scan angle error is much greater, and the maximum error levels for the one- and two-bit phase quantization cases are 0.6 and 0.2 deg, respectively. This is a significant fraction of the required minimum scan angle increment of 0.65 deg.

The result is that for the one-bit phase shifter case, a restricted scan range is needed in order to avoid these regions. For example, scanning in the +Az/+El quadrant away from the origin and axes leads to maximum pixel misregistration levels of less than 0.3 deg. An array with two-bit phase quantization alleviates pointing error issues even further. Regions with the largest scan angle errors result in pointing errors of about 15% of the diffraction-limited resolution. In this case, no restriction of scan range is necessary, enabling more flexibility in the optical system design if needed.

3.2.4  Effects  of  quantization  lobes 

Another problem associated with phase quantization that results in degraded imaging performance is the appearance of quantization lobes in close angular proximity to the main beam. If these lobes are not spatially blocked and propagate through the optical system, they can cause false returns and introduce artifacts. This problem is compounded for applications involving imaging of moving targets at high frame rates where the scene may be rapidly changing from frame to frame. If one assumes that energy outside of the desired angular scan region is terminated in an absorptive load, the optical system presented in Sec. 2.2 will perform some spatial filtering. This is particularly important for scanned arrays with one-bit phase quantization, where quantization lobes including ones of equal intensity to the main beam appear symmetrically about the origin in Az/El (see Fig. 6). However, this does not address the appearance of quantization lobes within the desired Az/El scan region. For this reason, we compare characteristics of arrays with one- and two-bit phase quantization in terms of their effects on the imaging performance of the optical system shown in Table 1. This work examines the angular location, number, and intensity of quantization lobes calculated over the scan range of interest for these particular setups.

Fig. 7 (a) The magnitude of steering angle error in degrees for an array with one-bit phase quantization is shown as a function of scan ranges of interest for our optical system. (b) Shows steering angle error as a function of scan angle for the two-bit phase quantization case. For comparison purposes, the plot range matches the results of (a).

Figure 6 displays the angular locations of quantization lobes for a 78 × 78 element half-wavelength spaced square lattice array for one particular scan angle. To assess the impact of quantization lobes over the total desired scan region, array factors were generated in a manner similar to the previous section for scan angle increments of 0.65 deg over a total scan region including 0 to 26 deg in +Az/+El. The fraction of energy contained inside the main beam was calculated and used to determine the impact of quantization lobes. Individual summed energy results were normalized to the total energy calculated over the whole mapped region for each array factor. The angular radius of the main beam was determined by the diffraction-limited 1/e² radius of a theoretical Gaussian for the array [Eq. (10)], calculated for each desired scan angle.

Using this method, the fraction of energy in the main beam was computed over all scan angles in the +Az/+El quadrant for the one- and two-bit phase quantization cases. The beam efficiency results are shown in Fig. 8 as a function of pointing angle. The grayscale range has been restricted to highlight energy fractions between 10% and 50%. There is comparatively little structure in the plots near the origin where the energy fraction is the greatest (>70%). The average fraction of energy in the main beam for arrays with both one- and two-bit phase shifters is small, about 20% and 23%, respectively, ignoring energy lost to other quadrants. It is evident that the impact of quantization lobes and side lobes on the main beam energy fraction is significant. There is an additional 25% efficiency factor inherent to one-bit phase quantization since this plot neglects the radiated energy appearing symmetrically about the Az/El origin (see Fig. 6). Although this decreases the energy efficiency, it is not an insurmountable problem as long as enough source power is available to the radar imaging system and the other quadrants are spatially filtered.

The array main beam energy efficiency results do not provide direct information about the number and intensity of the lobes. To this end, we used the same array factors generated over the +Az/+El scan region to calculate the number of quantization lobes appearing over the scan range of interest as a function of lobe intensity. To perform this calculation, we set a peak threshold relative to the main beam of -20 dB. All lobes at and above this threshold were tabulated in bins of 1 dB for each array factor. The results were summed by bin and normalized by the total number of array factors calculated. Figure 9 shows a histogram of the results for arrays with one- and two-bit phase shifters.
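The tabulation described above can be sketched as a simple binning routine. The input format (one list of lobe levels in dB per array factor), the bin-edge convention, and the sample data are all hypothetical; the sketch only illustrates thresholding at -20 dB, binning in 1 dB steps, and normalizing by the number of array factors.

```python
import math
from collections import Counter


def lobe_histogram(lobe_levels_db, threshold_db=-20.0):
    """Average number of lobes per array factor in each 1 dB bin.

    lobe_levels_db holds one list per array factor, each containing lobe
    peak levels in dB relative to the main beam. Bin k covers [k, k + 1) dB.
    """
    counts = Counter()
    for levels in lobe_levels_db:
        for level in levels:
            if threshold_db <= level <= 0.0:
                counts[math.floor(level)] += 1
    n_af = len(lobe_levels_db)
    return {bin_db: count / n_af for bin_db, count in sorted(counts.items())}


# Hypothetical lobe levels for two array factors (illustration only)
sample = [[-3.2, -12.7, -25.0], [-3.9, -19.5]]
hist = lobe_histogram(sample)
print(hist)
```

Summing the values of the returned dictionary gives the total number of lobes per array factor above the threshold, the quantity compared between the one- and two-bit cases.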

As  expected,  the  total  number  of  lobes/AF  summed  across 
all  bins  is  greater  for  one-bit  phase  quantization.  Generally, 
for  the  two-bit  phase  quantization  case,  there  are  fewer  over¬ 
all  lobes,  including  those  of  comparable  intensity  to  the  main 
beam  (this  is  seen  down  to  the  -6  dB  level).  For  the  one-  and 
two-bit  phase  quantization  cases,  the  total  number  of  lobes/ 
AF  appearing  over  the  scanned  region  down  to  the  -20  dB 
level  is  about  60  and  50,  respectively.  Most  of  these  lobes 


Fig.  8  (a)  The  fraction  of  energy  contained  within  the  main  beam  is  shown  over  the  desired  Az/El  scan  range  for  an  array  with  one-bit  phase 
quantization.  The  average  energy  fraction  is  <20%.  (b)  Shows  the  energy  fraction  within  the  main  beam  for  the  two-bit  phase  quantization 
case  calculated  over  the  same  scan  region.  The  average  energy  fraction  is  about  25%  and  for  easy  comparison;  the  plot  range  matches  the 
results  of  (a). 



Fig.  9  Histograms  of  the  number  of  quantization  lobes  normalized  to 
the  total  number  of  array  factors  calculated  over  the  scan  range  of 
interest  are  shown  binned  by  lobe  amplitude  relative  to  the  main 
beam.  Results  are  shown  for  one-  and  two-bit  phase  quantization 
cases. 


appear  at  lower  intensity  levels.  For  example,  considering 
lobes  ranging  between  0  dB  and  -15  dB  reduces  both  of 
these  numbers  by  more  than  half  to  about  20  lobes  total. 

As shown in Fig. 9, a significant number of lobes appear
above the -20 dB threshold over the total desired scan
region. These lobes are potentially harmful since they propagate
through the optical system and can result in spurious
returns. As discussed above, they are less severe in terms of
number and intensity for the array with two-bit phase shifters,
but they are prevalent. In order to mitigate or eliminate
the possibility of false returns and irreparable imaging artifacts,
it may be advantageous to consider alternatives. Since
larger arrays and arrays with more phase quantization states
add considerably to overall complexity, contributing to
future fabrication difficulties, examining the performance
of other array geometries, such as aperiodic arrays, may
provide better results.


3.3  Reflectarray  Elements 

3.3.1  Antennas 

Antenna elements compatible with large scanning reflectarrays
must meet several criteria. They need to be compatible
with wafer scale assembly and phase shifter integration.
They must also be compatible with unit cell separations
less than λ0 and incorporation into large-format 2-D arrays
with low mutual coupling. At lower frequencies up to
60 GHz, reconfigurable reflectarrays have been proposed
and demonstrated with microstrip patches,20-22 although a
variety of planar antenna elements including dipoles, coplanar
waveguide (CPW), and slot antennas may also suffice. In
the high frequency millimeter-wave region, suitable antenna
element choices are more limited. At frequencies higher than
120 GHz, the small widths required for slot antennas and
CPW make physical implementation of these structures challenging.3
Planar dipoles are a possibility but may not be compatible
with common phase shifter geometries, including
MEMS switches. On the other hand, microstrip patches
meet all of the criteria and are a favorable choice for implementing
a scanned 220 GHz reflectarray.
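
The requirement on unit cell separation can be illustrated with the standard grating lobe condition for a uniform lattice, sin(θ_m) = sin(θ_0) + m·λ0/d. The sketch below is an illustration under stated assumptions and not part of the original analysis: it treats a single principal plane of an ideal periodic array, and the function name `grating_lobe_angles_deg` is hypothetical.

```python
import math


def grating_lobe_angles_deg(spacing_in_wavelengths, scan_deg, max_order=5):
    """Angles (deg) of visible grating lobes for element spacing d / lambda_0.

    Uses sin(theta_m) = sin(theta_0) + m / (d / lambda_0) in one principal plane;
    a lobe of order m is visible when |sin(theta_m)| <= 1.
    """
    s0 = math.sin(math.radians(scan_deg))
    angles = []
    for m in range(-max_order, max_order + 1):
        if m == 0:
            continue  # m = 0 is the main beam
        s = s0 + m / spacing_in_wavelengths
        if abs(s) <= 1.0:
            angles.append(math.degrees(math.asin(s)))
    return sorted(angles)


# Half-wavelength spacing: no grating lobes in visible space at a 10 deg scan.
print(grating_lobe_angles_deg(0.5, 10.0))
# Full-wavelength spacing: one grating lobe enters visible space.
print(grating_lobe_angles_deg(1.0, 10.0))
```

Under these assumptions, spacing near half a wavelength keeps grating lobes out of visible space for modest scan angles, while spacing near one wavelength does not.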


3.3.2  Element  integration 

Adopting a simple element structure is important for large
array construction, particularly since integration of antennas
with phase shifters has not yet been demonstrated at frequencies
as high as 220 GHz. A possible unit cell configuration
that has been implemented at 60 GHz21 and proposed for use
at frequencies as high as 600 GHz3 uses a microstrip patch
antenna connected to a short-circuited stub that is loaded
with either a diode or MEMS switch. The switch or diode
acts as a one-bit phase shifter, providing either an open or
short impedance, and results in a 180 deg phase shift
between the two states. This simple geometry is promising
for realizing 220 GHz unit cells, although the size of phase
shifters and MEMS switches is a concern4 at these frequencies.
Their dimensions can be comparable to patch antennas,
and viable unit cell designs must minimize the possibility of
radio frequency (RF) signal coupling to phase shifters. Digital
control is a key capability for a large array. Unit cell
designs incorporating electronically controlled components
are therefore advantageous. G-band MEMS-based varactors
and switched capacitors have been demonstrated with digital
control,6 providing a path to realizing array unit cells. An
added benefit of these components for large arrays is the
use of fabrication techniques that are compatible with
CMOS processes and large-volume production methods.
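
To make the effect of a one-bit (0/180 deg) or two-bit (0/90/180/270 deg) phase shifter concrete, the following sketch rounds an ideal steering phase to the nearest available state. It is a simplified illustration, not the authors' model: it assumes a linear array in one scan plane with plane-wave illumination (no feed phase compensation), and the names `quantize_phase` and `steering_phases` are hypothetical.

```python
import numpy as np


def quantize_phase(phase_rad, bits):
    """Round phases to the nearest of 2**bits equally spaced states in [0, 2*pi)."""
    step = 2 * np.pi / 2**bits
    return (np.round(np.asarray(phase_rad, dtype=float) / step) * step) % (2 * np.pi)


def steering_phases(n_elements, spacing_in_wavelengths, scan_deg):
    """Ideal progressive phase (rad) that steers a linear array to scan_deg."""
    n = np.arange(n_elements)
    phase = -2 * np.pi * spacing_in_wavelengths * n * np.sin(np.radians(scan_deg))
    return phase % (2 * np.pi)


ideal = steering_phases(16, 0.5, 10.0)
one_bit = quantize_phase(ideal, bits=1)   # states: 0, pi
two_bit = quantize_phase(ideal, bits=2)   # states: 0, pi/2, pi, 3*pi/2

# Residual phase error per element, wrapped to (-pi, pi].
err_one = np.angle(np.exp(1j * (one_bit - ideal)))
err_two = np.angle(np.exp(1j * (two_bit - ideal)))
print(np.degrees(np.abs(err_one)).max(), np.degrees(np.abs(err_two)).max())
```

The residual error is bounded by half the quantization step, that is 90 deg for one bit and 45 deg for two bits, which is the origin of the quantization lobes and pointing errors discussed in the text.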

4  Conclusions 

In this work we analyze the performance of a system that
uses a scanning reflectarray, or reflection-type phased
array, to perform electronic scanning of a confocal reflector
system for applications including video-rate imaging of
meter-sized targets at standoff distances of 50 m. This
work analyzes a system design for operation in the
220 GHz atmospheric window. Designs incorporating reflectarrays
with confocal imagers have not previously been
explored at this frequency range. The 220 GHz window
was chosen as a favorable compromise between aperture diameter,
resolution, component availability, signal attenuation,
and transmissivity/reflectivity properties of targets. Details
of the design are presented in this work. A confocal reflector
system was chosen mainly for its ability to scan a large aperture
with a much smaller scanning element. This is particularly
attractive since large-format electronically scanned
arrays have not been demonstrated at these frequencies. A
1 m primary aperture was selected as a tradeoff between system
size and resolution. A 3-inch wafer-scale array was chosen
based on compromises between system size, total array
elements required, and achievable number of image pixels.
Properties of uniform rectangular lattice arrays of different
sizes and element spacings compatible with the confocal
reflector system were examined. Encircled energy calculations
revealed that grating lobes can contain large fractions
of the total energy (>50%) over the angular scan ranges of
interest for arrays with element spacings near λ0. Other array
geometries that would result in lower grating lobe levels
should be considered if element spacing near λ0/2 is unachievable.
Effects of array element phase quantization were
considered since achieving continuous phase shift capabilities
compatible with the demands of our application
(360 deg total phase shift with 2 deg precision) is impractical.
Properties of 2-D rectangular arrays with one- and
two-bit phase shifters and their impacts on the imaging
performance of our system were examined. The choice to
focus on one- and two-bit phase quantization was a result
of our desire to minimize the complexity of the array for fabrication.
Beam pointing error was examined over scan ranges
of interest for our optical system. For the one-bit phase quantization
case, large scan angle errors were encountered,
exceeding 25% of the diffraction-limited resolution at the
worst locations. Two-bit phase quantization was better,
resulting in pointing angle errors of at most 15% of the
spot size. As a method of assessing the impact of quantization
lobes as a function of scan angle, array main beam efficiency
was calculated over the scan range of interest. The
average energy fraction in the main beam was comparable
for the one- and two-bit phase quantization cases when normalized
to the total energy over the +Az/+El scanned
quadrant. The number and intensity of quantization lobes
appearing over the desired scan range was examined. For
both the one- and two-bit phase quantization cases, a significant
number of lobes appeared at intensity levels above the
-20 dB threshold. These lobes are potentially harmful to our
imaging system since they propagate through the optical system
and may compete with the main beam, producing artifacts
and false returns. In order to mitigate these problems,
other array geometries will be considered in the future.
Design and integration of reflectarray elements was briefly
considered. Based on currently demonstrated component
technology and a desire to simplify element layout, unit
cells incorporating patch antennas and one-bit digitally
controlled phase shifters are favorable for G-band array
implementation.
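
As a rough check on the scale of the pointing errors quoted above, the diffraction-limited spot can be estimated from the stated 220 GHz operating frequency, 1 m primary aperture, and 50 m standoff distance. The sketch below assumes the Rayleigh criterion (1.22 λ/D) as the definition of resolution, which the text does not specify; the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def wavelength_m(freq_hz):
    """Free-space wavelength in meters."""
    return C_M_PER_S / freq_hz


def rayleigh_spot_m(freq_hz, aperture_m, range_m):
    """Approximate diffraction-limited spot size at a given range.

    Assumption: Rayleigh criterion, angular resolution = 1.22 * lambda / D.
    """
    return 1.22 * wavelength_m(freq_hz) / aperture_m * range_m


spot = rayleigh_spot_m(220e9, aperture_m=1.0, range_m=50.0)
print(f"wavelength: {wavelength_m(220e9) * 1e3:.3f} mm")
print(f"spot size at 50 m: {spot * 100:.1f} cm")
print(f"25% of spot (one-bit worst case): {0.25 * spot * 100:.1f} cm")
print(f"15% of spot (two-bit worst case): {0.15 * spot * 100:.1f} cm")
```

Under this assumption the spot is on the order of 8 cm at 50 m, so the quoted worst-case pointing errors correspond to displacements of roughly one to two centimeters at the target.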
Samples of HgCdSe alloys were grown via molecular beam epitaxy on thick ZnTe buffer layers on Si
substrates. Two Se sources were used: an effusion cell loaded with 5N source material that produced
a predominantly Se6 beam and a cracker loaded with 6N material that could produce a predominantly
Se2 beam. The background electron concentration in as-grown samples was significantly reduced by
switching to the Se cracker source, going from 10¹⁷-10¹⁸ cm⁻³ to (3-5) × 10¹⁶ cm⁻³ at 12 K. The
concentration remained low even when the cracking zone temperature was lowered to produce a
predominantly Se6 beam, which strongly suggests that a major source of donor defects is impurities
from the Se source material rather than Se species. Secondary ion mass spectroscopy was performed.
Likely donors such as F, Br, and Cl were detected at the ZnTe interface while C, O, and Si were
found at the interface and in the top 1.5 μm from the surface in all samples measured. The electron
concentration for all samples increased when annealed in a Cd or Hg overpressure and decreased
when annealed under Se. This suggests the presence of native defects such as vacancies and
interstitials in addition to impurities. Overall, by switching to higher purity Se material and then
annealing under Se overpressures, the background electron concentration was reduced by an order of
magnitude, with the lowest value achieved being 9.4 × 10¹⁵ cm⁻³ at 12 K. © 2013 American Vacuum
Society. [http://dx.doi.org/10.1116/1.4798651]

I.  INTRODUCTION 

Currently, the infrared material of choice is mercury cadmium
telluride (MCT). MCT is a ternary alloy with a
bandgap that can be tuned from the short wave infrared
(SWIR) to the very long wave infrared (VLWIR). High quality
MCT can be grown via molecular beam epitaxy (MBE)
on bulk lattice-matched cadmium zinc telluride (CZT), with
dislocation densities ~10⁵ cm⁻². However, bulk CZT has a
maximum area of roughly 50 cm², making it unsuitable for
the manufacture of a large area focal plane array (FPA).

MCT can also be grown by MBE on silicon (Si) with a cadmium
telluride (CdTe) buffer layer. Si wafers are available
in diameters at least as large as 10 in., but the 19% lattice
mismatch between MCT and Si results in large dislocation
densities that limit device performance, particularly for long
wave infrared (LWIR) MCT.1

An alternative material is mercury cadmium selenide
(MCS). Like MCT, MCS is a ternary alloy with a bandgap
tunable from the SWIR to the VLWIR. MCS belongs to a
family of materials with lattice parameters near 6.1 Å. GaSb,
another member of this family, is now available in wafers
with a diameter of 4 in., with 6 in. diameter GaSb wafers
currently under development. Additionally, this 6.1 Å family
also includes materials with band gaps suitable for detection
applications in the visible and ultraviolet spectral ranges.

Therefore, one could conceivably create a device made from
lattice-matched materials capable of sensing from the ultraviolet
to the VLWIR on a single chip.2


a)Electronic  mail:  Kevin.doyle.30.ctr@mail.mil 


One obstacle to the use of MCS for devices has been the
large background electron concentration that has been
reported for this material. Despite not being intentionally
doped, MCS samples typically had electron concentrations
greater than 10¹⁷ cm⁻³ at 77 K whether in the form of bulk
samples3 or of epitaxial layers deposited by MBE.4 The electron
concentration remained high with little variation even at
temperatures as low as 4 K, suggesting the presence of a
shallow donor level located near or within the conduction
band. The background concentration could either be reduced
or increased by annealing under various conditions, suggesting
the presence of native defects such as vacancies and
interstitials.5 Sources of these donor defects need to be identified
so that a process to eliminate them either during
growth or through postgrowth annealing can be developed.

II.  EXPERIMENT 

MCS samples were grown via MBE on Si substrates with
zinc telluride (ZnTe) buffer layers.6 The samples were
grown in an ultrahigh vacuum MBE chamber made by DCA
Instruments. The substrates were mounted on molybdenum
blocks with colloidal graphite. Immediately prior to loading,
the ZnTe/Si substrates were etched in a 0.2% Br:Methanol
solution for 30 s followed by a brief methanol rinse, a 10 s
etch in 10% HCl, a 60 s rinse in deionized water, and then
blown dry with N2. Once loaded, the substrate was heated
under a Te overpressure while monitored in situ by reflection
high energy electron diffraction to remove any remaining
oxides prior to growth. Clips held the edges of each substrate,
and the thickness (and therefore the growth rate) of
each sample could be determined by measuring the “step”
created by the clip with a profilometer.

MCS samples were grown using elemental mercury (Hg),
cadmium (Cd), and selenium (Se) sources. The beam equivalent
pressure (BEP) emanating from all sources was measured
with a beam flux monitor (BFM) consisting of a nude
ion gauge placed directly in the path of flux. Quadruple distilled
Hg was supplied by a 600 cc valved effusion cell and
Cd with 99.999% (5N) purity was supplied by a 400 g
SUMO Cell, both made by Applied EPI. Initially, a Model
VSbllO effusion cell made by ADDON was used to supply
5N Se. However, Se vapor consists of many polyatomic
species from Se2 to Se8, and at this effusion cell’s typical
operating temperature of 325 °C (598 K), the predominant
species of Se flux was Se6 (uncracked Se).7

The Se effusion cell was replaced with a Mark V
Selenium Valved Cracker made by Veeco, which directed
the Se vapor through a cracking zone that could be heated
up to 800 °C (1073 K) to produce a predominantly Se2 beam
(cracked Se).7,8 Differences in the ionization efficiencies of
the Se species resulted in different sensitivities of the
BFM depending on which species were dominant. For a
fixed reservoir temperature of 250 °C and a fixed valve position
(to maintain a constant flux), the BEP measured for the
cracker source was found to vary with the cracking zone
temperature, tracking with the data found in Ref. 7. This suggests
that the Se flux transitions from predominantly Se6 to
predominantly Se5 at around 650 K and then to predominantly
Se2 near 900 K (Fig. 1). The Se BEP measured for the
typical cracking zone temperature of 800 °C was found to be
close to a factor of two lower than at the typical effusion cell
temperature of 325 °C for the same amount of exiting Se,
reflecting a difference in ionization energy for the various
species. This correction factor was applied to Se BEP from
the cracker source when comparing the two sources. While
the effusion cell was loaded with 5N purity Se, the cracker
was loaded with Se with 99.9999% (6N) purity.
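
The correction described above amounts to scaling the cracker BEP reading before comparing it with effusion cell readings. A minimal sketch follows, assuming the correction is a single multiplicative factor of about two (the text says only "close to a factor of two"); the constant and function names are hypothetical.

```python
# Assumption: the BFM reads roughly 2x lower for cracked Se (Se2 at 800 C)
# than for uncracked Se (Se6 at 325 C) at the same exiting Se flux.
CRACKER_BEP_CORRECTION = 2.0


def effusion_equivalent_se_bep(cracker_bep_torr, correction=CRACKER_BEP_CORRECTION):
    """Scale a cracker-source Se BEP so it is comparable to effusion cell readings."""
    return cracker_bep_torr * correction


def bep_ratio(metal_bep_torr, se_bep_torr):
    """Cd/Se or Hg/Se BEP ratio for a given pair of readings."""
    return metal_bep_torr / se_bep_torr
```

For example, a Cd/Se ratio for a cracker-grown sample would be formed as `bep_ratio(cd_bep, effusion_equivalent_se_bep(se_bep))` so that it can be compared directly with ratios from the effusion cell.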

Samples were grown at different temperatures using various
Cd to Se and Hg to Se BEP ratios. Substrate temperature
was measured with a pyrometer as well as a thermocouple on
the sample manipulator. However, samples grown using the
cracker source presented some difficulty in measuring the
substrate temperature. Heat from the high temperature cracking
zone was reflected off the substrate into the pyrometer,
making it more difficult to obtain an accurate measurement.
An estimate of substrate temperature was determined by comparing
thermocouple and pyrometer temperature readings.

Fig. 1. Se BEP vs cracking zone temperature for a fixed Se reservoir temperature
of 250 °C and valve position of 150 mils.

Cutoff wavelength was determined via transmittance
measurements using Fourier transform infrared spectroscopy,
and the molar fraction of CdSe in the MCS alloy, or
x-value, was determined from this measurement using the
relationship between band gap and x-value developed by
Summers and Broerman.9 Hall measurements were performed
over a range of temperatures from 4 to 300 K, on
samples subjected to various postgrowth anneals. Finally,
secondary ion mass spectroscopy (SIMS) was performed by
the Charles Evans Analytical Group.
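
The first step of the composition measurement, converting a cutoff wavelength into a band gap energy, uses the photon energy relation E = hc/λ. The sketch below covers only that step and assumes the cutoff wavelength corresponds directly to the band gap; the Summers and Broerman relation between band gap and x-value is not reproduced here, and the function name is hypothetical.

```python
PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV*s
C_M_PER_S = 299_792_458.0       # speed of light in m/s


def bandgap_ev_from_cutoff_um(cutoff_um):
    """Band gap energy (eV) from a cutoff wavelength in micrometers, E = h*c / lambda."""
    return PLANCK_EV_S * C_M_PER_S / (cutoff_um * 1e-6)


# A 10 um (LWIR) cutoff corresponds to a band gap of about 0.124 eV.
print(round(bandgap_ev_from_cutoff_um(10.0), 4))
```

The resulting band gap would then be inverted through the published band gap versus x-value relationship to obtain the alloy composition.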

III.  RESULTS  AND  DISCUSSION 

A.  Growth  parameters 

Due to the very low sticking coefficient of Hg, samples
were grown with large Hg BEPs (~10⁻⁴ Torr). For a fixed
substrate temperature and Hg overpressure, the growth rate
varies linearly with Se BEP (Fig. 2), and for a fixed Se BEP,
the x-value can be controlled by the Cd/Se BEP ratio
(Fig. 3). It was found that samples grown with cracked Se
had a higher x-value than samples grown with uncracked Se
with the same Cd/Se ratio, suggesting greater incorporation
of Cd with Se2 than Se6. Growth rates began to decrease at
approximately 130 °C for the valved source and 150 °C for
the cracker source (Fig. 4). The optimal MBE substrate temperature
for MCS grown with an Hg BEP of 2.5 × 10⁻⁴ Torr
was ~100 °C. This is lower than the optimal temperature for
MCT with a similar Hg BEP (~185 °C), most likely due to
the higher vapor pressure of Se compared to Te.

Fig. 2. (Color online) Growth rate vs Se BEP for both the effusion cell (Se6)
and the cracker source (Se2).

Fig. 3. (Color online) Cd composition vs Se/Cd BEP ratio for both the effusion
cell (Se6) and the cracker source (Se2).

Fig. 5. (Color online) As-grown electron concentration vs temperature for
MCS samples grown with both Se sources.

B. Hall measurements

The electron concentration versus temperature was measured
for samples grown with both Se sources using Hall
effect measurements with a standard magnetic field of 0.1 T.
A previous study of MCS grown via MBE using uncracked
Se reported little variation in electron concentration with
temperature below 100 K, with the electron concentration
remaining in the 10¹⁷-10¹⁸ cm⁻³ range at temperatures as
low as 30 K.4 This was consistent with samples grown by the
effusion cell, but MCS samples grown with the cracker
source exhibited a temperature dependency with as-grown
12 K electron concentrations in the 10¹⁶-10¹⁷ cm⁻³ range
(Fig. 5). The relative lack of carrier freeze-out and the lower
electron concentrations with the cracker source suggest the
presence of donors with energy levels located near or within
the conduction band that were significantly reduced by
switching to the Se cracker source.
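
For readers unfamiliar with how a carrier concentration is extracted from a Hall measurement, the textbook single-carrier relation is n = I·B / (e·t·|V_H|). The sketch below is an idealized illustration and not the authors' analysis procedure: it assumes a single carrier type, a Hall factor of unity, and a uniform layer of thickness t. The current, thickness, and Hall voltage in the example are hypothetical values; only the 0.1 T field comes from the text.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19


def carrier_concentration_cm3(current_a, field_t, thickness_m, hall_voltage_v):
    """Single-carrier Hall concentration in cm^-3: n = I*B / (e * t * |V_H|)."""
    n_per_m3 = current_a * field_t / (
        ELEMENTARY_CHARGE_C * thickness_m * abs(hall_voltage_v)
    )
    return n_per_m3 * 1e-6  # convert m^-3 to cm^-3


# Hypothetical example: 1 mA through a 3.5 um layer at 0.1 T with a 10 mV Hall voltage.
n = carrier_concentration_cm3(1e-3, 0.1, 3.5e-6, 1e-2)
print(f"{n:.3e} cm^-3")
```

A concentration that stays nearly constant as the temperature is lowered, as reported for the effusion cell samples, indicates carriers that do not freeze out.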

Two differences between the Se effusion cell and the Se
cracker source that could explain the lower electron concentrations
are the different species in the Se beam
(~Se6 vs ~Se2) and the higher purity source material in the
cracker source (5N vs 6N). MCS samples were grown using
the 6N Se in the cracker source, with the cracking zone temperature
lowered to 325 °C to produce an uncracked, predominantly
Se6 beam. The electron concentration remained low
even when the cracking zone temperature was reduced to
325 °C (Fig. 6), strongly suggesting that the reduction in
concentration was due to the higher purity source material
and not the predominantly Se2 flux. Electron mobility for the
MCS samples increased as the x-value decreased (Fig. 7).

Fig. 4. (Color online) Growth rate vs estimated substrate temperature with a
fixed Se BEP for both the effusion cell (Se6) and the cracker source (Se2).

A prior study of HgSe annealed under Hg and Se suggested
that Hg interstitials (n-type), Se vacancies (n-type),
and Hg vacancies (p-type) were possible native defects in
MCS.5 These possibilities were investigated by subjecting
MCS samples to separate 24 hour, 250 °C anneals under vacuum,
a Hg overpressure, a Cd overpressure, or a Se overpressure
in sealed quartz ampoules. The electron concentration
always increased when annealed under Hg or Cd, and the
electron concentration was reduced for samples grown with
the cracker source and then annealed under Se. No significant
changes were observed for samples annealed under vacuum
(Fig. 8). RBS measurements indicated an increase in
x-value when annealed under Cd, but no significant change in
composition with the other anneals.

Fig. 6. (Color online) As-grown electron concentration at 77 K vs x-value
for the effusion cell (Se6), the cracker source at typical operating temperatures
(Se2), and cracker source with the cracking zone temperature at 325 °C
(Se6).

Fig. 7. (Color online) As-grown electron mobility at 77 K vs x-value for the
effusion cell (Se6), the cracker source at typical operating temperatures
(Se2), and cracker source with the cracking zone temperature at 325 °C
(Se6).

The lowest 12 K electron concentration achieved was for
a Se-annealed sample with an x-value of 0.15. The as-grown
12 K concentration of 1.2 × 10¹⁶ cm⁻³ was reduced to
9.4 × 10¹⁵ cm⁻³ after annealing under Se. Overall, switching
to the higher purity Se and then annealing under a Se overpressure
typically reduced the background electron concentration
by an order of magnitude at 77 K. Direct studies of
vacancies and interstitials are being performed through positron
annihilation spectroscopy and Rutherford backscattering
channeling spectroscopy, respectively. These results will be
presented at a later date.
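
The size of the Se-anneal improvement for the best sample can be checked directly from the two 12 K concentrations quoted above. The sketch below is only arithmetic on those reported values; the function names are hypothetical.

```python
import math


def fractional_reduction(before, after):
    """Fraction by which a quantity decreased, relative to its starting value."""
    return (before - after) / before


def decades_of_reduction(before, after):
    """Reduction expressed in orders of magnitude (log10 of the ratio)."""
    return math.log10(before / after)


as_grown = 1.2e16     # cm^-3 at 12 K, as reported
se_annealed = 9.4e15  # cm^-3 at 12 K, as reported

print(f"{100 * fractional_reduction(as_grown, se_annealed):.1f}% lower after the Se anneal")
print(f"{decades_of_reduction(as_grown, se_annealed):.2f} decades")
```

For this sample the anneal alone accounts for a reduction of roughly one fifth; the order-of-magnitude figure quoted in the text refers to the combined effect of the higher purity Se and the anneal.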

C.  SIMS  measurements 

SIMS measurements were conducted on an MCS sample
grown with the effusion cell (thickness = 7.3 μm), an MCS
sample grown with the cracker source (thickness = 3.5 μm),
and a HgSe sample grown with the effusion cell (thickness
= 3.4 μm) in order to identify unintentional impurities
(Fig. 9). For all three samples, group VII elements such as
Br, Cl, and F were detected at the interface with the ZnTe
buffer layer. Br and Cl could be introduced during the substrate
preparation process and could serve as n-type dopants
if they substituted on group VI (Se) lattice sites. C and O were
detected in a region approximately 1.5 μm thick at the surface
of the MCS and at the interface between MCS and
ZnTe. The source of these impurities and whether they are
electrically active in MCS needs to be established. Two
other contaminants listed in the Cd source material certificate
of analysis were group VI S and group IV Si, both
of which were detected in all samples but significantly
reduced in the HgSe sample where the Cd source was not
used, strongly suggesting they are contaminants in the Cd
source material.

Fig. 8. (Color online) 77 K electron concentration both as-grown and after
annealing under various overpressures of Hg, Se, Cd, and under vacuum.
All anneals were performed in quartz ampoules pumped down to ~10⁻⁵
Torr, then sealed and kept in a furnace at 250 °C for 24 h, followed by a 3 h
cool-down. *x-value listed is prior to annealing.

Fig. 9. (Color online) SIMS results of an MCS sample grown with the effusion
cell (thickness = 7.3 μm), an MCS sample grown with the cracker source
(thickness = 3.5 μm), and an HgSe sample grown with the effusion cell
(thickness = 3.4 μm) for (a) carbon, (b) oxygen, and (c) bromine. SIMS measurements
were performed by the Charles Evans Analytical Group.

Unfortunately, none of the SIMS measurements to date
have differed significantly between the MCS samples grown
with the 5N and 6N Se source material, and so the impurities
that were reduced by switching to higher purity Se source
material have yet to be identified.

IV.  SUMMARY  AND  CONCLUSIONS 

MCS samples were grown via MBE on ZnTe/Si substrates
using two different Se sources: an effusion cell loaded
with 5N source material that produced a predominantly Se6
beam and a cracker source loaded with 6N source material
that could be varied to study other Se polyatomic species.
Samples grown with the Se2 had greater x-values with lower
Cd/Se BEP ratios, suggesting greater Cd incorporation with
Se2. The growth rate began to decrease when the substrate
temperature was raised above ~130 °C under an Se6 flux and
~150 °C under an Se2 flux. The optimal substrate temperature
for MCS grown with the effusion cell was found to be
~100 °C for an Hg BEP of 2.5 × 10⁻⁴ Torr, lower than the
optimal temperature for MCT growth with a similar Hg
overpressure (~185 °C).

Electron concentrations remained high even at low temperatures,
with as-grown 12 K concentrations ranging from
10¹⁷ to 10¹⁸ cm⁻³ for samples grown with 5N Se source material
and 10¹⁶-10¹⁷ cm⁻³ for samples grown with 6N Se.
Impurities can produce energy levels located in the conduction
band of narrow-gap materials, such as In dopants in
MCT. As a result, these impurities do not freeze out at lower
temperatures and the concentration remains high even at
temperatures as low as 4 K.10 The fact that the electron concentration
remains high in MCS even at low temperatures
indicates the presence of energy levels in the conduction
band similar to MCT, and the fact that the 12 K concentration
was lower for 6N Se strongly suggests that impurities
are introduced from contaminants in the Se source material.

SIMS measurements detected impurities which could be
acting as donors, the most prevalent of which was C. Br and
Cl were detected at the MCS/ZnTe interface, suggesting
they could be introduced by the substrate preparation process.
Significant levels of C and O were detected at the
MCS/ZnTe interface and in the top 1.5 μm of the MCS layer
from the surface. Further measurements are required to
determine how these impurities are introduced, whether they
are electrically active, and how they can be eliminated.


The MCS electron concentration could also be changed
by postgrowth annealing. Anneals under Hg and Cd overpressures
raised the electron concentration, while anneals
under Se or vacuum lowered the electron concentration. This
would suggest the presence of native defects such as interstitials
and vacancies in addition to the background impurities.
The identity of these native defects and an annealing process
to eliminate them are currently under investigation.

If MCS is to be used for LWIR applications, the background
electron concentration needs to be reduced to at most
~10¹⁵ cm⁻³ (assuming a similar lifetime to MCT). Switching
from 5N Se to 6N Se reduced the electron concentration
from 10¹⁷-10¹⁸ cm⁻³ to (3-5) × 10¹⁶ cm⁻³, suggesting that
the background concentration could be further reduced by
using 7N or higher purity Se source material. Further study
of native defects present in MCS is required so that a process
for removing them through postgrowth annealing can be
optimized. Once the background electron concentration has
been fully minimized, p-type doping of MCS can be developed
so that MCS device layers can be produced.
Abstract — The quantum efficiency (QE) of a quantum well infrared
photodetector (QWIP) is historically difficult to predict and
optimize. This difficulty is due to the lack of a quantitative model to
calculate QE for a given detector structure. In this paper, we found
that by expressing QE in terms of a volumetric integral of the vertical
electric field, the QE can be readily evaluated using a finite
element electromagnetic solver. We applied this model to all known
QWIP structures in the literature and found good agreement with
experiment in all cases. Furthermore, the model agrees with other
theoretical solutions, such as the classical solution and the modal
transmission-line solution when they are available. Therefore, we
have established the validity of this model, and it can now be used
to design new detector structures with the potential to greatly improve
the detector QE.

Index  Terms — Electromagnetic  field  modeling,  infrared  detector, 
quantum  efficiency  (QE). 


I.  Introduction 

A well-known property of quantum well infrared photodetector
(QWIP) materials is their lack of optical absorption
under normal incidence. Consequently, each
detector in an array is outfitted with a light-coupling structure
for detection. Although the coupling designs are usually guided
by certain physical principles, their exact quantum efficiencies
(QEs) are not always predictable. In the cases where analytical
models do exist, they inevitably contain assumptions or
simplifications, such as infinite detector sizes or infinite metal
conductivities, that render the predictions inaccurate. The lack
of a quantitative model has thus far prevented QE improvement,
and as a result, the QWIP technology has generally been
regarded as a low QE technology.

One attempt in the past to yield a quantitative prediction
is through rigorous electromagnetic (EM) modeling [1]-[4],
but the success was rather limited. Recently, we showed that
by expressing QE as an integral of the vertical electric field
Ez, its value can be readily and reliably computed by a commercial
finite element EM solver [5]. We applied this model
to several detector structures and obtained quantitative agreement
with experiments. In this paper, we expand this study
to include all known geometries in the literature and report
the findings. For completeness, we also report those described
in [5]. The structures surveyed in this paper are: edge-coupled
QWIPs, linear- and cross-grating QWIPs, random-grating
QWIPs, corrugated-QWIPs of prism and pyramidal
geometries, enhanced-QWIPs, quantum grid infrared photodetectors,
plasmonic-enhanced QWIPs, and photonic-crystal-slab
QWIPs. In addition, we also compare the numerical solutions
with the analytical classical solutions in the cases of
edge-coupled detectors and corrugated QWIPs and the modal
transmission-line solutions in the case of quantum grid infrared
photodetectors. The agreement turns out to be satisfactory in all
examples. With the model verified, we use it to design and optimize
new detectors. The result shows that the theoretical QE
can reach 70-80% in some cases without an antireflection (AR)
coating. Therefore, QWIPs have the potential for high QE.


II.  EM  Model 

Previously, we have established that the absorption QE,
labeled as η, of any detector geometry can be predicted by
evaluating the following expression with a finite-element EM
computation [5]:


\[
\eta = \frac{n\alpha}{A E_0^{2}} \int_V |E_z|^{2}\, dV \tag{1}
\]


where n is the refractive index of the detector material,
α is the absorption coefficient for vertically polarized light, A is
the detector area, E0 is the incident electric field from the air, V
is the detector active volume, and Ez is the self-consistent vertical
electric field. Equation (1) states that QE can be calculated from
the volume integral of |Ez|² in the presence of a finite α.

Since E0 and Ez are linearly proportional to each other, E0
can be set arbitrarily, and the only input parameter in (1) is the
wavelength-dependent α(λ), which can be calculated based on
the material layer structure [6]. For a known α(λ), there will
be no more free parameters, and the value of η(λ) is uniquely
and unambiguously determined. To solve Ez numerically, we
use a commercial finite element solver. In addition to η, we also
define another quantity, the external QE or ηext, which is QE ×
pixel area fill factor (= A/Apitch).
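
As a rough illustration of how (1) can be evaluated once a solver has returned the field, the sketch below sums |Ez|² over a uniform grid and then applies the fill factor. The function names, the uniform-grid assumption, and the sample numbers in the note that follows are illustrative choices; they are not part of the original model or of any particular solver's interface.

```python
import numpy as np

def absorption_qe(ez, cell_volume, n, alpha, area, e0):
    """Discretized form of (1): eta = n*alpha/(A*E0^2) * sum(|Ez|^2) * dV.

    ez          : real or complex array of Ez samples inside the active volume (V/m)
    cell_volume : volume of one grid cell (um^3); a uniform grid is assumed
    n, alpha    : refractive index and absorption coefficient (1/um)
    area        : detector area A (um^2)
    e0          : incident field amplitude in air (V/m)
    """
    integral = np.sum(np.abs(ez) ** 2) * cell_volume
    return n * alpha * integral / (area * e0 ** 2)

def external_qe(eta, area, pitch_area):
    """External QE = QE x pixel area fill factor (A / A_pitch)."""
    return eta * area / pitch_area
```

For a hypothetical uniform field Ez = 0.1 E0 filling a slab of thickness t, the sum collapses to η = 0.01 nαt, which gives a quick check of a grid implementation. Because only the ratio Ez/E0 enters, scaling both fields by the same factor leaves the result unchanged.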

Fig. 1. The figure shows a θ = 45° edge-coupled QWIP. The figure also
shows |E| and Ez distributions obtained from EM modeling at λ = 10 μm with
E0 = 377 V/m.

III.  EM  Solutions 

To  assess  the  reliability  of  our  model,  we  first  apply  (1)  to  a 
light-coupling  scheme  that  has  a  classical  solution.  It  is  the  edge 
coupling  via  a  45°  polished  facet  [7].  Fig.  1  shows  the  detector 
geometry. 

The classical solution ηc for this geometry is

\[
\eta_c(45^\circ) = T_{sub}\,\frac{\cos 45^\circ}{2}\,\bigl[1 - \exp\bigl(-\alpha(45^\circ)\,L(45^\circ)\bigr)\bigr] \tag{2}
\]

where Tsub = 4n/(1 + n)² is the transmission coefficient of the
GaAs substrate; α(45°) = α sin² 45° is the material absorption
coefficient at 45°; L(45°) = 2t/cos(45°) is the optical pathlength
inside the detector for two passes of light, and t is the active
material thickness. In (2), the first cos(45°) accounts for the
smaller projected detector area in the direction of light and the
factor 1/2 accounts for the fact that only half of the light, the
transverse magnetic (TM) mode, is coupled. In this classical
model, if α is constant with respect to wavelength, ηc will also
be wavelength independent. For a typical α of 0.15 μm⁻¹ and t
of 3 μm, ηc(45°) can be conveniently calculated to be 12.3%.
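
The arithmetic behind this figure can be sketched as follows. The text does not state the refractive index used for this estimate, so n = 3.1 below is an assumption chosen for illustration; the function name is likewise hypothetical.

```python
import math

def classical_edge_qe_45(n, alpha, t):
    """Evaluate the classical 45-degree edge-coupling expression (2)."""
    theta = math.radians(45.0)
    t_sub = 4.0 * n / (1.0 + n) ** 2          # substrate transmission coefficient
    alpha_45 = alpha * math.sin(theta) ** 2   # absorption coefficient at 45 degrees
    path = 2.0 * t / math.cos(theta)          # optical path length for two passes
    return t_sub * (math.cos(theta) / 2.0) * (1.0 - math.exp(-alpha_45 * path))

# alpha in 1/um and t in um, as in the text; n = 3.1 is an assumed value.
eta_c = classical_edge_qe_45(n=3.1, alpha=0.15, t=3.0)
print(f"classical QE at 45 degrees: {100 * eta_c:.1f}%")
```

With these inputs the result is roughly 12.3%. Using n = 3.239, the value quoted later for the FEM calculations, lowers it to about 12.0%, so the estimate is only mildly sensitive to the index.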

Since under the present detector geometry, the direction perpendicular
to the plane of incidence is invariant, we reduce (1)
to 2-D, in which

\[
\eta(\lambda) = \frac{1}{2}\,\frac{n\alpha}{d E_0^{2}} \int_X |E_z|^{2}\, dX \tag{3}
\]

where the factor 1/2 accounts for one coupled polarization (the
TM mode), d = 100 μm is the assumed detector linear dimension
in the horizontal direction, E0 = 377 V/m, and X = d × t is
the detector cross-sectional area. The present example consists
of a 3-μm active layer on top of a GaAs substrate, a 1-μm
top  GaAs  contact  layer,  and  a  \-fim  gold  contact  layer.  The 

Fig. 2. The figure shows the calculated QE as a function of incident λ at three
different edge angles. The dashed line shows the classical value at 45°.


properties of gold are represented by a wavelength-dependent
complex refractive index. The value of α is again taken to be
0.15 μm⁻¹, independent of λ. Although α is a constant in this
model, the computed Ez distribution and, thus, η nevertheless
are λ-dependent because of the wave nature of light. The detector
is placed near the edge of a polished GaAs substrate, and the
length scale is indicated in Fig. 1(b).

The color plot in Fig. 1(a) shows the absolute magnitude of
the total E field. Most of the detector active region is found to
be uniformly illuminated at this angle except near the corners.
On the left, the circled region is shadowed by the substrate
in the front, which prevents light from entering the
detector directly. On the right, the reflection surfaces at the top, at the
side, and at the substrate form a Fabry-Perot etalon and produce
the rapidly varying intensity. The Ez component, which is
responsible for absorption, is plotted in Fig. 1(b). Due to the
interference between the incident light and the reflected light from
the top surface, a standing wave of local maxima and minima is
established.

By integrating |Ez|² within the active cross section according
to (3), η can be evaluated. The result is shown in Fig. 2, along
with two other edge angles. The QEs only weakly depend on
λ, partially validating the classical assumption. At 45°, η varies
with two distinct frequencies. The slow variation is due to the
gradual shift of the standing wave along the vertical axis as
λ changes. It is centered around ~12.3%, in agreement with
the classical model. The higher frequency oscillations are the
Fabry-Perot oscillations produced at the right-hand corner. This
example shows that the EM model, being a numerical solution
to Maxwell's equations, is not only consistent with the classical
model but also accounts for all other optical effects neglected in
the classical model [7].

For 3-D modeling, we first examine the external QE of a linear
grating that was used in a polarization detection experiment [8].
The detector area A is 18.6 × 18.6 μm² and the pixel pitch
area Apitch is 20 × 20 μm². The detector structure is shown
in Fig. 3(a). The calculated Ez distribution at the center cross
section is shown in Fig. 3(b) for E0 = 377 V/m perpendicular
to the grating lines. The corresponding (unpolarized) ηext
is shown in Fig. 3(c) based on the α spectrum calculated from
the material structure. The theoretical peak ηext is 14.1% for
this optical polarization. Experimentally, the peak responsivity

Fig. 3. (a) Grating having 2.1-μm grating period and 0.68-μm depth. (b) Ez
distribution at the center cross section at λ = 8.3 μm. (c) Calculated (dashed
curve) and measured QE (solid curve).


R for this polarization was measured to be 0.48 A/W. With the
reported photoconductive gain g of 0.57, the deduced ηext is
12.5%, which is only slightly lower than the prediction by a
factor of 1.12. Besides the agreement in the QE magnitude,
the calculated lineshape also matches well with the measured
spectrum as shown in Fig. 3(c). Therefore, the EM model successfully
explains the light-coupling characteristics of a linear
grating.
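
The conversion from a measured responsivity to a QE is not spelled out in the text. The sketch below assumes the standard photoconductor relation R = η g e λ/(hc) and, as a further assumption, takes the peak wavelength to be the 8.3 μm quoted in the Fig. 3 caption. The function and constant names are illustrative.

```python
H_C_OVER_E_UM_V = 1.23984  # h*c/e expressed in V*um (numerically eV*um)

def qe_from_responsivity(responsivity_a_per_w, gain, wavelength_um):
    """Invert R = eta * g * e * lambda / (h * c) for eta."""
    return responsivity_a_per_w * H_C_OVER_E_UM_V / (wavelength_um * gain)

# Linear-grating values from the text: R = 0.48 A/W, g = 0.57.
eta_ext = qe_from_responsivity(0.48, gain=0.57, wavelength_um=8.3)
print(f"deduced external QE: {100 * eta_ext:.1f}%")
```

Under these assumptions the result is about 12.6%, close to the 12.5% quoted above; the small difference presumably reflects the exact peak wavelength used in the original deduction.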

Fig. 4(a) shows the next example of a cross grating [9]. This
grating-QWIP consists of a 1.5-μm active QWIP material, a
1.5-μm top contact layer, a 1.5-μm bottom common contact
layer, and a 0.1-μm etch stop layer. The pixel pitch is 25 μm.
To model the experimentally realized structure, we set the pixel
dimension to be 25 μm at the base and 23 μm at the mesa
top. The square frame around the grating grid has a height of
0.6 μm and a width of 2 μm at the top. Inside the frame, the
grating height is 0.4 μm and the grating period is 4.0 μm. The
widths of the base and tip of the grid lines are 0.9 and 0.3 μm,
respectively. Fig. 4(b) shows the center cross section of the
modeled structure without the top metal cover layer. This top

Fig. 4. (a) Experimental grating structure. (b) Modeled structure and the Ez
distribution in the center cross section at λ = 11.0 μm. (c) Experimental data
scaled by a factor of 1.2 (solid curve with circles), the calculated QE of the
grating (dashed curve), and the calculated QE without the grid lines (solid
curve).


metal cover is replaced by the perfect electric conductor (PEC)
boundary condition in the model. The calculated peak α of this
material is 0.10 μm⁻¹ at 11.3 μm, and the 50% cutoff is at
12.1 μm.

Experimentally, the peak conversion efficiency CE (= η × g)
is measured to be 2.41% at 0.78 V. Together with an estimated
g = 0.56, which is scaled from a 60-period structure, the external
QE is 4.31%. This experimental QE is 20% lower than the 5.0%
predicted from the EM model shown in Fig. 4(c). By multiplying
the experimental value with a factor of 1.2, a good match of the
spectral lineshape is obtained below 12 μm. The theoretical QE
beyond 12 μm is limited by the assumed material absorption,
which cuts off at 12.1 μm. Overall, the present EM model is
adequate in explaining the grating efficiency after taking the
detailed detector structure into account. As seen in Fig. 4(a), the
present mesa has substantially inclined sidewalls. By repeating
the calculation without the grid lines, we found the sidewall
reflection contributes about 33% of the QE near the peak
wavelengths.

Fig. 5(a) shows the geometry of a random grating invented
by Brill and Sarusi [10], [11]. The pixel size is 28 μm × 28 μm
and the grating height is 0.67 μm. The thickness of the active

Fig. 5. (a) Random grating QWIP. (b) Ez distribution at the center cross
section at λ = 8.7 μm. (c) Calculated (dashed curve) and measured QE (solid
curve).


material is 3.28 μm and the thickness of the bottom contact
is 0.95 μm. The substrate is completely removed and an AR-coating
is applied to the detector. Fig. 5(b) shows the layer
structure. The QWIP material contains 50 periods of 55-nm-thick
AlGaAs barriers and 4.9-nm-thick quantum wells doped
to 5 × 10¹⁷ cm⁻³, with which the peak α is calculated to be
0.104 μm⁻¹. Experimentally, the peak detector responsivity is
measured to be 0.6 A/W at a bias where the gain is 0.32. The
measured QE is thus 23.4%. On the other hand, the calculated
QE is 27.9%, which is 20% larger than the experimental value.
Fig. 5(c) shows the calculated and the measured lineshapes,
which are in satisfactory agreement.

Fig. 6(a) shows another grid structure, which is known as
the enhanced-QWIP [12]. Unlike a cross grating,
the active detector material in this case is etched to form the
grid structure and the radiation is incident directly onto the grid.
Fig. 6(a), (b), and (c) show the detector structure, the Ez field
distribution at λ of 9.2 μm, and the calculated QE for a constant
α of 0.21 μm⁻¹, respectively. This value of α is calculated from
the material structure at the absorption peak. Fig. 6(c) also plots
the measured QE of two different detectors at their peaks. The
theory and experiment are in agreement with each other.

The structure of a quantum grid infrared photodetector (QGIP),
which is similar to the enhanced-QWIP, is shown in Fig. 7. It consists
of a linear array of grid lines of active materials. In the modeled
structure, the top gold layer thickness tm is 0.2 μm, the top
contact layer thickness tc is 0.1 μm, the active layer thickness
ta is 1.18 μm, the bottom contact layer thickness is 1.83 μm,

Fig. 6. (a) E-QWIP geometry. (b) Color plot of Ez distributions at 9.2-μm
incident wavelength. The displayed plane is located at the center of the grid layer.
The grid has a period of 8 μm, a strip width of 1.2 μm, and a strip height
of 1.43 μm. The back of the grid is coated with a 0.4-μm layer of gold. (c)
Calculated (curve) and the observed QEs (squares) are shown.


and the separation among the grid lines s is 4.65 μm. The width
of the grid line w, which determines the detection wavelength λp
of the detector, varies among different detectors. The substrate
is assumed to be thick such that the light enters into the detector
from the substrate side classically with a transmission coefficient
Tsub.

Previously, λp of the detector had been designed using the
2-D modal transmission-line (MTL) method [13]. Fig. 8 shows
that by choosing w appropriately, the detectors can be made to
detect at each integer wavelength from 8 to 15 μm. In that
modeling, a constant relative dielectric constant of 9.722 + i
had been adopted for the active layer, and the refractive index
of GaAs was set to be 3.118. Based on the present finite
element method (FEM), we repeat the same calculation using
the identical detector parameters. Fig. 8 shows that the present
FEM solution agrees closely with the previous MTL solution,
confirming both methods. The small differences in the shorter
wavelengths could be due to the truncation of certain infinite
Fourier series in the MTL approach.

Fig. 7. (a) Top view of a QGIP. The numbers are dimensions in microns. (b)
Side view.

Fig. 8. QE spectra calculated based on (a) FEM (solid curves) and (b) MTL
method (dashed curves) having the same detector parameters. The numbers are
w in microns.


In Fig. 9, we plot the experimental coupling efficiency of
two QGIP detector elements [13], which is defined as the ratio
of the responsivities of the grid and the edge-coupled detector.
This ratio cancels out the material α dependence and it is
directly proportional to the detector QE with a constant α. In
Fig. 9, we also show the theoretical QE with a constant α of
0.20 μm⁻¹ and n = 3.239 using the FEM model. Since the
substrate of these detectors is about 200-μm thick, the classical
substrate transmission is applicable in this calculation. The calculated
spectrum in Fig. 9 explains the overall lineshape in the
experiment, although one cannot compare the absolute QE in
this plot. Note that the experimental spectrum is averaged over
180 narrow and long (400-μm) grid lines. The width fluctuations
along the grid lines are expected to broaden the QE peak
predicted by the theory. Therefore, the present EM modeling is
also applicable to the QGIP structure.

In the previous MTL analysis, it was determined that with a large
s = 4.65 μm, the diffraction effect among the grid lines is
small and the metal contact on top of each grid line serves as
a half-wave antenna. An absorption peak will occur whenever
the incident λ ~ 2nw. To verify this conclusion, we plot the
Ez distribution at λp = 11.1 μm of the w = 1.8 μm detector
in Fig. 10. The dipole scattering field distribution inside the
grid lines is evident in this plot, which validates the previous
conclusion.
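
The half-wave rule of thumb λ ~ 2nw can be tabulated with a few lines of code. This is only a sketch of the approximate relation, not of the MTL or FEM calculation, and the choice n = 3.118 (the GaAs index quoted for the MTL modeling) is an assumption about which index applies; the function names are hypothetical.

```python
def half_wave_resonance_um(width_um, n):
    """Approximate peak wavelength of a grid line acting as a half-wave antenna."""
    return 2.0 * n * width_um

def width_for_wavelength_um(wavelength_um, n):
    """Invert the rule of thumb: grid-line width that targets a given wavelength."""
    return wavelength_um / (2.0 * n)

n_gaas = 3.118  # assumed index; value quoted for GaAs in the MTL modeling
for target_um in range(8, 16):
    width = width_for_wavelength_um(target_um, n_gaas)
    print(f"target {target_um} um -> w of about {width:.2f} um")
```

For w = 1.8 μm this estimate gives a wavelength near 11.2 μm, in the neighborhood of the 11.1 μm peak found by the full calculation, which is as much as an approximate relation of this kind can be expected to deliver.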

Fig. 9. Measured (solid curve) coupling efficiency and the calculated QE
(dashed curve) for (a) w = 1.80 μm and (b) w = 2.33 μm. The value of w is
measured using scanning electron microscope.

Fig. 10. Ez distribution based on FEM with E0 = 377 V/m.


Fig. 11 shows the 3-D geometry and the Ez distributions of
a prism-shaped corrugated QWIP. This detector geometry uses
optical reflection at the angled sidewalls to create the needed
Ez. This detector geometry also admits a classical solution
for QE based on ray optics [15]. Fig. 12 shows the classical
solution, the rigorous EM solution, and the average experimental
QE spectra for two focal plane arrays (FPAs) having different
cutoff wavelengths. Fig. 12(a) is for a QWIP containing 60
periods of 700-Å Al0.166Ga0.834As and 60-Å GaAs. This active
material is placed in the middle of the corrugation and the rest
of the volume is filled with contact materials. The array is not
antireflection (AR) coated. The calculated α spectrum has a
peak at λ = 11.9 μm with a value of 0.105 μm⁻¹ and a 50%
cutoff at λ = 12.7 μm. Fig. 12(b) is from another QWIP that is
made of 60 periods of 700-Å Al0.23Ga0.73As and 48-Å GaAs.
The calculated α is peaked at λ = 8.7 μm with a value of
0.145 μm⁻¹. Fig. 12 again shows the agreement among the two
theoretical models and the experimental data in terms of the
spectral lineshape and the absolute magnitude. Accounting for

Fig. 11. Detector geometry, which is without an antireflection (AR) coating.
The Ez distribution is shown at λ = 11.2 μm with E0 = 377√2 V/m.

Fig. 12. Figure shows the calculated and measured external QE of two detector
materials without an AR-coating.

Fig. 13. Detector geometry and the Ez distribution at λ = 8.8 μm where
E0 = 377√2 V/m. This detector is glued to a piece of silicon using epoxy.

the optical interference, the EM model is better equipped than
the classical model in describing the QE oscillations.

Fig. 13 shows the geometry of a pyramidal C-QWIP, which
is AR coated (ARC). This detector geometry has four angled
sidewalls to reflect light. As shown in Fig. 14(a), both the classical
[15] and the EM models predict the peak QE correctly.
The large discrepancy in the spectrum around 8 μm is due to
the known absorption of the epoxy glue used in the FPA integration.


Fig. 14. (a) Calculated and measured external QE spectra of a pyramid-shaped
C-QWIP FPA. (b) Infrared image taken by the corresponding 1-MP FPA.

The infrared image in Fig. 14(b) was taken by the corresponding
1-megapixel FPA camera.

Fig. 15. (a) Plasmonic-enhanced QWIP structure. (b) Calculated Ez at λ =
8.3 μm with E0 = 377√2 V/m.

Fig. 16. Measured (solid curve) and the calculated QE with a constant α =
0.05 μm⁻¹ (dashed curve) of a plasmonic enhanced QWIP.

Fig. 17. Cross sections of the PCS-QWIP and the calculated Ez at one of the
sharp peaks with λ = 6.44 μm or ν = 1552 cm⁻¹.

Fig. 18. Measured photocurrent spectrum in arbitrary unit (solid curve), and
the calculated QE spectrum based on the displayed α spectrum.

Fig. 15(a) shows the top view of a plasmonic enhanced QWIP
studied by Wu et al. [16]. In this structure, a 400-Å-thick gold
film perforated with circular holes is deposited on a 0.528-μm-thick
InGaAs/InP active material. The spacing between two
holes is 2.9 μm and their diameter is 1.4 μm. The active material
has a low doping such that the peak α is calculated to
be 0.05 μm⁻¹. The modeled Ez distribution at 0.25 μm below
the gold film is shown in Fig. 15(b). From the Ez distribution,
the calculated QE for this constant α is shown in Fig. 16. It is
peaked at λ = 8.3 μm. Meanwhile, from the measured R spectrum
[16] and the estimated gain of 8.5 from a similar detector
structure [17], the experimental QE is deduced to be 12.6%,
which agrees with the theory to within 10%, and the two spectra
have similar lineshapes.

Fig. 17 shows the cross sections of a photonic-crystal-slab-QWIP
(PCS-QWIP) studied by Kalchmair et al. [18]. The nominal
structure consists of an array of air holes with hole spacing
a = 3.1 μm and hole diameter d = 1.24 μm etched through the
active and contact materials. The active material thickness ta
is 1.5 μm, and the bottom contact thickness tc is 0.5 μm. The
PCS is suspended in the air at a nominal height tair = 2.0 μm
above the GaAs substrate. The measured photocurrent spectrum
is shown in Fig. 18. Based on the material α spectrum shown in
Fig. 18, which is deduced from the measured photoresponse of
the edge-coupled detector [18], the QE spectrum is calculated
and it indeed contains the characteristic sharp peaks. However,
the calculated sharp peaks do not align exactly with the measurement.
To obtain a better alignment, as shown in Fig. 18, the
theoretical parameter a is reduced slightly from 3.1 to 2.9 μm.
This discrepancy could be due to experimental uncertainties or
the theoretical assumption of an average n of 3.239 over a wide
range of wavelengths. In reality, the value of n varies from 3.34
at 4 μm to 3.04 at 11 μm. In addition to the adjusted a, the
magnitudes of these peaks depend weakly on tair. To obtain a
larger peak at position A in Fig. 18, tair is adjusted from 2.0 to

Fig. 19. Absorption coefficient α assumed in the EM modeling.

0.9 μm. A different tair could be due to the sagging of the PCS
in the air at the operating temperature. From the modeling, the
sharpness of these peaks is caused by two factors. One is the
nearly symmetrical detector structure, both in vertical and horizontal
directions, that induces strong resonances. The second
is the weak material absorption at the peak wavelengths, which
introduces only small damping effects on the resonances. The
close match of the main peak and some of the side peaks lends
support for the present modeling approach.

IV.  EM  Design 

After the EM model is verified by the existing experiments,
we can use it for detector design. In the past, the design of a
light-coupling structure has been focused mainly on the diffractive
element (DE) placed on top of the detector. The size, thickness,
and shape of the detector were not part of the consideration.
The present modeling instead allows the design of the DE and
the detector volume as one integral light-coupling entity. We
found that the detector volume actually plays a crucial role in
determining QE, in which it acts as a resonant cavity to the light
diffracted from the DE. With the versatility of the finite element
method, one is also able to consider a much wider variety of
DEs whose patterns can be far more complex than that of a
regular grating. In general, a DE can be in the form of a photonic
Bravais lattice with a basis or in the form of irregularly
distributed scatterers. The basis and scatterers can be of any 3-D
geometrical objects. The opening up of these arbitrary patterns
offers tremendous choices of QE characteristics both in spectral
lineshape and in absolute magnitude. This versatility in the
detector geometrical design adds to the well-known versatility
in the QWIP material design. The combination of the two will
yield a tremendous flexibility in designing the specific detector
optical properties. The integration of a DE and a resonant cavity
is referred to as the resonator QWIP [19], or the R-QWIP. With
different DE designs to suit different applications, there will
be different types of R-QWIPs. We have since studied a large
number of these detector designs and obtained a wide range of
coupling characteristics. Here, we describe two of the simplest
designs for illustration purposes: one is the grating-resonator-QWIP
or GR-QWIP and another is the ring-resonator-QWIP or
RR-QWIP.

Fig. 20. Calculated QE for different detector size p for (a) a constant α of
0.20 μm⁻¹ and (b) a varying α according to Fig. 19.


First, we optimize a 25-μm pitch GR-QWIP without an AR-coating
for 8-9-μm detection. The value of α is either assumed
to be constant at 0.20 μm⁻¹ or a narrowband spectrum shown
in Fig. 19. The period of the grating is first set at 2.7 μm.
All the rest of the parameters, such as the active layer thickness,
the grating height, the bottom contact thickness, and the
pixel linear size p, are adjusted to give the maximum QE in the
8-9-μm band. Fig. 20 shows one of the optimizing procedures
by varying p alone, while all other parameters have been optimized.
The result shows that the pixel size has a modest effect
on the QE in this case. For a constant α, the maximum QE is
78.6% achieved by a p = 22-μm GR-QWIP at λ = 8.1 μm.
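
A sweep of this kind can be organized as a brute-force search over pixel size and wavelength. The sketch below is only a schematic of that bookkeeping: `qe_at` stands for whatever routine returns the QE from a solver run and is hypothetical, and `toy_qe` is a made-up smooth function used purely to exercise the search, not a stand-in for the FEM results reported here.

```python
import math
from typing import Callable, Iterable, Tuple

def best_design(qe_at: Callable[[float, float], float],
                sizes_um: Iterable[float],
                wavelengths_um: Iterable[float]) -> Tuple[float, float, float]:
    """Return (p, wavelength, QE) for the largest QE found on the grid."""
    wavelengths = list(wavelengths_um)  # reused for every pixel size
    candidates = ((p, lam, qe_at(p, lam)) for p in sizes_um for lam in wavelengths)
    return max(candidates, key=lambda row: row[2])

def toy_qe(p: float, lam: float) -> float:
    """Made-up response peaking at p = 20 um, 8.5 um; illustration only."""
    return 0.5 * math.exp(-((p - 20.0) / 3.0) ** 2 - ((lam - 8.5) / 0.4) ** 2)

p_best, lam_best, qe_best = best_design(
    toy_qe, [18.0, 19.0, 20.0, 21.0, 22.0], [8.0, 8.5, 9.0])
```

In practice each call to `qe_at` would be a full EM solve, so the grid would be kept coarse and the remaining parameters (layer thicknesses, grating height) would be swept in the same way or handed to a dedicated optimizer.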

The aforementioned modeling shows that the GR-QWIP is a
promising detector structure for narrowband detection. Using an
array of square rings as the DE instead, the coupling bandwidth
can be widened as shown in Fig. 21(a). In this structure, the
outer dimension of each ring is 4 μm and the inner dimension
is 1.4 μm. The wider bandwidth is beneficial even for a narrowband
material as seen in Fig. 21(b). It reduces the spectral variations
with different p and preserves the absorption lineshape of
the material. The largest QE in Fig. 21(a) is 73.1% achieved at
p = 21.5 μm and λ = 9.9 μm.

Since  these  GR-QWIPs  and  RR-QWIPs  will  be  built  on  a 
very  thin  active  material  layer,  the  photoconductive  gain  can  be 
as  large  as  0.6  at  full  bias.  Therefore,  the  estimated  conversion 

Fig. 21. Calculated QE for different detector size p for (a) a constant α of
0.20 μm⁻¹ and (b) a varying α.


Fig. 22. Calculated QE for different RR-QWIP detector sizes for a varying α
according to Fig. 19.


efficiency  is  about  40%,  which  is  adequate  for  many  high-speed 
applications. 

V.  Modeling  of  Imperfections  and  Crosstalks 

The above calculations assumed that the designed structures
can be produced faithfully in experiment, which may not always
be feasible technologically or economically. The present model
is well suited to determining the impacts of processing imperfections
such as size nonuniformity and deformed DEs by modeling the
actual fabricated structures. The subsequent optimization can be
performed on the realizable patterns. Fig. 22 shows one such
example in modeling the optimized RR-QWIPs with rounded
ring corners. The result for the 23-μm detector shows that by
adjusting other detector parameters, the rounded rings can have
very similar QE as the square rings. Based on the rounded ring
structure, one can also design efficient detectors for smaller


Fig.  23.  Calculated  QE  of  the  neighboring  RR-QWIP  pixels  when  a  plane 
wave  is  incident  onto  the  center  pixel.  The  legend  specifies  the  pixel  coordinates. 


pixel  sizes.  As  shown  in  Fig.  22,  the  detector  can  have  57%  QE 
for  13-/im  pixels  and  40%  QE  for  8-/im  pixels. 

EM modeling can also be used to determine other FPA
properties such as pixel crosstalk. For small pitch arrays, crosstalk
due to pixel optical diffraction is a concern. In order to
determine the amount of crosstalk, one can evaluate the values of QE
of the surrounding pixels while only the center pixel pitch area
is illuminated. Fig. 23 shows an example for the 10-μm pitch
arrays, in which the result for five nearest neighbors is plotted.
From this calculation, the crosstalk is estimated to be less than
2.5% for the RR-QWIP structure.
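
One plausible way to summarize such a calculation is sketched below. Normalizing each neighbor's QE by the QE of the illuminated pixel is an assumption, since the text does not state how the crosstalk figure is normalized, and all QE values in the example are hypothetical rather than read from Fig. 23.

```python
def crosstalk_fractions(center_qe, neighbor_qe):
    """Return each neighbor's QE as a fraction of the illuminated pixel's QE."""
    if center_qe <= 0:
        raise ValueError("center_qe must be positive")
    return {pixel: qe / center_qe for pixel, qe in neighbor_qe.items()}

# Hypothetical QE values keyed by pixel coordinates (not taken from Fig. 23).
center = 0.40
neighbors = {(1, 0): 0.008, (0, 1): 0.006, (1, 1): 0.002}

fractions = crosstalk_fractions(center, neighbors)
worst = max(fractions.values())
print(f"worst-case crosstalk: {worst:.1%}")
```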


VI.  Conclusion 

The QE of a detector uniquely determines its sensitivity under the
background-limited infrared performance condition and, hence,
it is a critical figure of merit for a detector technology.
QWIPs possess many unique advantages, but historically suffer
from a low QE. This low QE is due to the lack of a quantitative
model to perform detector design and optimization. In this paper,
we have established an EM model that uses (1) to calculate QE
explicitly. This approach is shown to provide a quantitative
answer for any detector geometry in any desired degree of
detail. We verified its accuracy and reliability with experiments
and with analytical classical and modal transmission-line
solutions. With this approach, one can now ascertain the optical
coupling properties of a detector according to its physical construct. It is also well
known that the absorption properties of the detector material can
be calculated accurately from its layer structure. With the advent
of a rigorous model for the light-coupling structure, the QWIP
technology can now enter a new era, in which all of its
optical properties can be engineered with precision. This feature will be
invaluable to FPA production and application. In this paper, we
also optimized a grating resonator to achieve a high QE and
designed a ring resonator to broaden its coupling bandwidth. Other
coupling lineshapes can be designed similarly to suit any
application.

We present computational and experimental results of dust particles that can be tuned to preferentially reflect or emit IR radiation
within the 8-14 μm band. The particles consist of thin metallic subwavelength gratings patterned on the surface of a simple quarter
wavelength cavity. This design creates distinct IR absorption resonances by combining the plasmonic resonance of the grating with
the natural resonance of the cavity. We show that the resonance peaks are easily tuned by varying either the geometry of the grating
or the thickness of the cavity. Here, we present a computational design algorithm along with experimental results that validate the
design methodology.


1.  Introduction 

Most objects, either manmade or found in nature, reflect and
emit infrared (IR) radiation in a relatively smooth spectrum;
however, by applying structures with resonant absorption to
the surface of those materials, the reflection and emission
spectra can be enhanced or reduced at particular wavelengths
(as illustrated in Figure 1). Moreover, by mixing small
resonant particles (<100 μm) designed for several different
wavelengths, we can create IR dust that reflects or emits
with a characteristic spectral signature. Such material-by-design
particles would be useful for a variety of practical
applications. For example, when applied to a base surface,
the resonant particles could be used to tune an IR reflectance
to mimic other natural or manmade surfaces. This could be
useful as a calibration standard for hyperspectral imaging
systems. Additionally, if the particles are chemically
functionalized, there are a number of remote atmospheric sensing
applications that could be explored.
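
The contrast between smooth and resonant thermal spectra can be illustrated numerically with Planck's law. The sketch below is not part of the original work: the 300 K temperature, the graybody emissivity of 0.5, and the Lorentzian emissivity line (center, width, background, and peak values) are all hypothetical choices made only for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def spectral_exitance(wavelength_um, temp_k, emissivity=1.0):
    """Spectral radiant exitance in W m^-2 um^-1 from Planck's law."""
    lam = wavelength_um * 1e-6
    x = H * C / (lam * KB * temp_k)
    per_metre = 2.0 * math.pi * H * C**2 / (lam**5 * math.expm1(x))
    return emissivity * per_metre * 1e-6  # per metre -> per micrometre

def resonant_emissivity(wavelength_um, center_um=9.5, width_um=0.5,
                        base=0.1, peak=0.95):
    """Hypothetical Lorentzian emissivity line on a low, flat background."""
    lorentz = 1.0 / (1.0 + ((wavelength_um - center_um) / (width_um / 2.0)) ** 2)
    return base + (peak - base) * lorentz

for lam in (8.0, 9.5, 11.0, 14.0):
    black = spectral_exitance(lam, 300.0)
    grey = spectral_exitance(lam, 300.0, emissivity=0.5)
    res = spectral_exitance(lam, 300.0, emissivity=resonant_emissivity(lam))
    print(f"{lam:5.1f} um  black={black:6.2f}  grey={grey:6.2f}  resonant={res:6.2f}")
```

The blackbody and graybody columns vary smoothly across the band, while the resonant column approaches the blackbody value only near the assumed line center.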

2.  Infrared  Absorbers  Using  Plasmonic  Gratings 

It is well known that metallic surfaces patterned on a
subwavelength scale exhibit unusual electromagnetic properties
at optical wavelengths. In particular, the presence of localized
surface plasmon resonances creates well-defined absorption
bands. This phenomenon has been studied and exploited by
a number of investigators to realize new types of sensors,
optical filters, and absorbers [1-5]. The goal of this work was
to numerically and experimentally study plasmonic-based
resonant absorbers in the long-wavelength IR (LWIR) band
(8-14 μm) that could be fashioned into small (~100 μm ×
100 μm × 25 μm) dust particles. The dust particles could
then be used to tailor the reflectivity/emissivity of a surface
or dispersed in air and used for atmospheric sensing
applications.

Figure 1: Notional diagram that illustrates the normally smooth
thermal exitance curves from blackbody and graybody objects
compared to the resonant behavior of our “engineered” IR resonant
dust. (Legend: blackbody, graybody, resonant.)

There are a number of small resonant absorbing “dust-like”
structures that could be used to preferentially absorb,
and thus thermally emit, IR radiation at specific wavelengths,
including dielectric ring resonators, resonant patch antennas,
and plasmonic-based resonators. These various structures
were compared based on (1) their ability to efficiently absorb
IR energy at selected wavelengths within the 8-14 μm band,
(2) the ability to easily tune the resonant absorption, (3)
ease of fabrication, and (4) manufacturing cost. Based on
these criteria, we chose to investigate, in detail, the relatively
simple surface plasmon-based designs shown in Figure 2.
The building blocks for this design are two thin resonant
cavities, one on the top of Figure 2 and the other on the
bottom. Each cavity is composed of a thin gold ground
plane, a thin dielectric substrate layer (formed from zinc
selenide (ZnSe) in our design), and a subwavelength metallic
grating made from gold. In the middle of the structure is a
relatively thick silicon layer needed for mechanical rigidity.
The symmetry of the top and bottom layers was needed
since the particles, when dispersed, would orient themselves
randomly.

The  strong  resonant  behavior  of  this  design  is  due 
to  a  combination  of  two  different  resonant  phenomena. 
The  first  is  a  surface  plasmon  resonance  that  is  excited 
within  the  subwavelength  gold  grating.  The  second  is  a 
cavity  resonance  excited  in  the  ZnSe  substrate  region  that 
is  between  the  grating  layer  and  the  metallic  ground  plane 
layer.  By  adjusting  the  thickness  of  the  ZnSe  substrate  for 
a  given  grating  period  and  duty  cycle,  a  strong  absorption 
resonance  can  be  excited  at  any  wavelength  within  the  8-14 
micron band. To create small dust particles, a large sample is
diced into small (~100 μm × 100 μm × 25 μm) particles.

3.  Computational  Modeling  and  Design 

Two  different  computational  models  were  employed  to 
rigorously  design  and  validate  the  resonant  structure  shown 
in  Figure  2.  The  first  method  is  a  fully  periodic  planar 
method called the rigorous coupled wave method. The
second method, the finite element method (FEM), was used to
investigate finite-sized particle effects. A brief description
of these two methods, along with simulation results, is
presented in the following sections.

3.1. Modeling of Infinitely Periodic Structures Using Rigorous
Coupled Wave Analysis. Two approaches are used extensively
for simulating the electromagnetic properties of infinitely
periodic subwavelength gratings. The first uses effective
media theory to provide closed-form approximations for
the effective dielectric constants as a function of the grating
structure [6]. Although attractive from a computational
perspective, the approximate expressions are accurate only
for gratings whose period is much smaller than the
wavelength of illumination. As the grating period approaches the
wavelength, which is referred to as the resonance regime,
the assumptions on which these closed-form expressions
are based are no longer valid. For our designs, we assumed
grating periods only slightly smaller than the material
wavelength and thus could not accurately utilize effective
media theory.
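
For reference, the kind of closed-form expression involved can be sketched with the standard zeroth-order effective medium formulas for a lamellar grating. These are the textbook forms and may differ from the specific expressions in [6]; the dielectric ridge and air gap used below are illustrative choices, not the gold grating studied here.

```python
def emt_parallel(eps_a, eps_b, fill):
    """Zeroth-order effective permittivity, E-field along the grating lines."""
    return fill * eps_a + (1.0 - fill) * eps_b

def emt_perpendicular(eps_a, eps_b, fill):
    """Zeroth-order effective permittivity, E-field across the grating lines."""
    return 1.0 / (fill / eps_a + (1.0 - fill) / eps_b)

eps_ridge = 2.41 ** 2   # illustrative dielectric ridge (index of 2.41)
eps_gap = 1.0           # air gaps between ridges
fill = 0.5              # 50% duty cycle

eps_par = emt_parallel(eps_ridge, eps_gap, fill)
eps_perp = emt_perpendicular(eps_ridge, eps_gap, fill)
print(f"eps_parallel = {eps_par:.3f}, eps_perpendicular = {eps_perp:.3f}")
```

Neither expression depends on the grating period, which is why this approximation cannot capture behavior in the resonance regime.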

We  instead  employed  a  second  approach  using  a  rigorous 
electromagnetic  model.  Although  computationally  more 
difficult,  this  approach  is  capable  of  generating  accurate 
results  for  gratings  of  any  period  size  and  shape.  Several 
rigorous  electromagnetic  models  can  be  used  for  this 
calculation.  We  chose  the  rigorous  coupled  wave  (RCW) 
algorithm  originally  presented  by  Moharam  and  Gaylord 
[7].  Our  specific  implementation  is  based  on  the  enhanced 
transmittance  matrix  approach  introduced  by  Moharam  and 
Gaylord [7] and later refined by Lalanne [8] and Noponen
and  Turunen  [9].  For  the  sake  of  brevity,  we  refer  the  reader 
to  the  references  above  for  details  on  the  RCW  method. 
While  being  accurate,  the  RCW  method  does  assume  the 
grating  structure,  shown  in  Figure  2,  is  infinite  in  the 
transverse  directions.  The  effect  of  finite-sized  samples  is 
investigated  in  Section  3.3. 

3.1.1. RCW Simulation Results for Infinitely Periodic Surfaces.
Figure 3 presents typical simulation results calculated using
the RCW method. In the figure, the reflectivity of the sample
is calculated as a function of wavelength and polarization for
a normally incident plane wave.

For this simulation, the ZnSe substrate thickness was
assumed to be 1.8 μm, and the gold grating period was 3.0 μm
with a 50% duty cycle. The gold gratings were assumed to
be 100 nm thick, and the gold ground planes were 300 nm
thick. The electromagnetic material properties of the gold
were determined using the model given in [10]. For the
ZnSe layer, a lossless index of refraction of n = 2.41 was
used in all simulations. The incident field was assumed
to be normally incident from the top. For this design, a
very strong, near-perfect resonant absorption is
predicted near 9.5 μm for the case of parallel polarization
(E-field polarized along the axis of the grating), and only weak
resonances occur for the case of perpendicular polarization
(E-field polarized perpendicular to the axis of the grating).

3.1.2. Reflectance Sensitivity to Geometrical Parameters.
Given a specific substrate and metallization material, such as
ZnSe and gold, the dust particle’s reflectance can be tuned
by proper selection of the geometrical parameters shown
in Figure 2: specifically, (1) the thickness of the ZnSe layer,
denoted by h in Figure 2; (2) the grating period, denoted by Λ
in Figure 2; (3) the grating’s duty cycle, given by w/Λ in
Figure 2; and (4) the thickness of the gold grating layer and gold
ground plane. Assuming the gold layers are thick enough to
prevent transmission (i.e., much thicker than the penetration
depth), parameters (1)-(3) above will have the most
effect on the LWIR reflectance.

Figure 2: Illustration of our surface plasmon-based IR resonant particles. The gold subwavelength gratings along with cavity resonances
produce distinct resonant absorption phenomena within the LWIR band.

Figure 3: Simulation results, using the RCW method, that present the reflectivity at normal incidence within the LWIR band. The reflectivity,
as expected, is polarization dependent due to the anisotropic nature of the gratings.

In Figure 4, we present the effect of the ZnSe substrate
thickness on the resonant behavior. As the thickness is
increased from 1.5 to 2.5 μm, the resonant dip shifts from 8.3
to 13.2 μm, respectively. Thus the resonant behavior can be
tuned by simply varying the thickness of the ZnSe substrate.
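
As a rough first guess for a target wavelength, the two endpoints quoted above can be interpolated. The sketch below assumes a linear dependence between them, which is only an approximation: at a thickness of 1.8 μm it returns about 9.8 μm, somewhat above the 9.5 μm resonance quoted earlier for that design, so any value obtained this way would still need to be refined with a full RCW calculation.

```python
def resonance_estimate_um(thickness_um):
    """Linearly interpolate the resonant wavelength between the quoted endpoints."""
    h0, lam0 = 1.5, 8.3    # thickness and resonance quoted in the text (micrometres)
    h1, lam1 = 2.5, 13.2
    if not h0 <= thickness_um <= h1:
        raise ValueError("thickness outside the simulated 1.5-2.5 um range")
    slope = (lam1 - lam0) / (h1 - h0)   # about 4.9 um of shift per um of ZnSe
    return lam0 + slope * (thickness_um - h0)

for h in (1.5, 1.8, 2.0, 2.5):
    print(f"h = {h:.2f} um -> resonance near {resonance_estimate_um(h):.2f} um")
```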

Alternatively, for a given substrate thickness, the resonant
absorption characteristics can be tuned by varying the
grating period and duty cycle. Shown in Figure 5 is the
simulated reflectance of a sample in which the substrate
thickness was fixed at 2.0 μm and the grating period was
varied from 1.0 to 3.0 μm. For this simulation, the duty
cycle was fixed at 50%. While the resonant wavelength clearly
varied with grating period, the change was less sensitive than
varying substrate thickness. Moreover, by just changing the
grating period, with all other parameters fixed, the amplitude
of the resonance would vary considerably. Lastly, we varied
the grating’s duty cycle while holding the substrate thickness
and grating period fixed at 2.0 and 3.0 μm, respectively. As
shown in Figure 6, the grating duty cycle has a large effect on
not only the resonant wavelength but also on the amplitude
and bandwidth of the resonance.

The sensitivity to incident angle was also evaluated using
the RCW code. A typical result for the case of both parallel
and perpendicular polarization is shown in Figure 7. Here,
the simulation results predict that the resonant frequency
for parallel polarization (Figure 7(a)) should slowly increase
as the incident angle increases from normal incidence (0
degrees in the figure) to near grazing angles (80 degrees). It
is interesting to note that for the case of parallel polarization
(Figure 7(a)) the variation in resonant wavelength with
incident angle is relatively small (<1 μm) even with near-grazing
incident angles. For the given application of resonant dust
particles, this is an attractive feature since the orientation of
the particles with respect to the incident field cannot be well
controlled.

Figure 4: Simulation results, using the RCW method, that present the reflectance at normal incidence within the LWIR band as the ZnSe
substrate thickness is varied from 1.5 to 2.5 μm. For this simulation, the grating period is fixed at 3.0 μm with a 50% duty cycle. The incident
wave was normally incident with parallel polarization. As the substrate thickness is increased, the resonant absorption peak shifts to longer
wavelengths but still remains strong. The bandwidth of the resonance also remains relatively fixed as the substrate thickness is varied.
(Legend: ZnSe substrate thickness = 1.5, 1.75, 2, 2.25, and 2.5 microns.)

Figure 5: Simulation results, using the RCW method, that present the reflectance at normal incidence within the LWIR band as the gold
grating period is varied from 1.0 to 5.0 μm. For this simulation, the substrate thickness is fixed at 2.0 μm with a grating duty cycle of 50%.
The incident wave was normally incident with parallel polarization. (Legend: grating period = 1 to 5 microns in 0.5-micron steps.)

Figure 6: Simulation results using the RCW method that present the reflectance at normal incidence within the LWIR band as the duty cycle
of the gold grating period is varied from 12.5% to 87.5%. For this simulation, the substrate thickness is fixed at 2.0 μm with a grating period
of 3.0 μm. The incident wave was normally incident with parallel polarization.

Figure 7: RCW predictions illustrating the sensitivity of our resonant structure with incident angle. The plot on the left (a) is for parallel
polarization, while the plot on the right (b) is for perpendicular polarization.

3.2. Iterative Design. As Figures 4 through 7 demonstrate,
the resonant absorption properties of the structure shown in
Figure 2 have a complicated dependence on a number of
geometrical parameters. As a result, it is unlikely that any simple
analytical design equation could be derived and used to
determine an optimal structure for a given desired response.
Consequently, we implemented a numerical iterative design
algorithm. Here the RCW method is used to calculate the full
wave solution for the reflectance as a function of wavelength,
polarization, and angle of incidence for a geometry of a given
substrate thickness, grating period, and duty cycle. An
optimization algorithm is then used to refine the geometry until
an objective function is minimized. The objective function
may vary depending on the application, but in most cases
we chose to minimize the total reflectance over some desired
wavelength band. A number of iterative optimization
algorithms could be employed including traditional derivative-based
algorithms, genetic algorithms, or direct pattern search
algorithms. An advantage of both genetic and pattern search
algorithms is that they do not require derivatives, and as
a consequence work well on nondifferentiable, stochastic,
and discontinuous objective functions. Both simple genetic
algorithms and direct pattern search algorithms were
implemented and tested for the application of interest here. While
both methods produced comparable results, the pattern
search algorithm was often computationally less expensive.
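
The idea of a derivative-free pattern search can be sketched as below. This is a generic compass search and not the authors' implementation; the objective function is a hypothetical quadratic stand-in for the band-averaged reflectance that an RCW solve would return, with a minimum placed arbitrarily at a 1.8 μm thickness and a 3.0 μm period.

```python
def pattern_search(objective, start, step=0.5, tol=1e-4, max_iter=10_000):
    """Minimise objective by polling +/- step along each coordinate (compass search)."""
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step *= 0.5   # shrink the pattern when no poll point improves
    return point, best

def toy_objective(params):
    """Hypothetical stand-in for band-averaged reflectance from an RCW solve."""
    thickness_um, period_um = params
    return (thickness_um - 1.8) ** 2 + 0.1 * (period_um - 3.0) ** 2

best_point, best_value = pattern_search(toy_objective, start=[2.5, 4.0])
print(best_point, best_value)
```

Only objective values are compared, so the same loop would work unchanged if the objective were noisy or discontinuous, at the cost of one solver call per poll point.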

3.3. Modeling of Finite Grating Effects Using the Finite Element
Method. The RCW method, while accurate and
computationally efficient, assumes the gratings to be infinitely
periodic. For our application, the samples are actually
diced into small (~100 μm × 100 μm × 25 μm) particles.
Consequently, it is important to understand the effects of
relatively small (<10 wavelengths) finite-sized particles on
the overall effectiveness of the design. To conduct these
simulations we used the commercial EM solver, HFSS from
Ansys. Simulations were conducted using HFSS’s FEM solver
with grating structures that varied from 25 to 100 μm on a
side.

Figure 8 plots the simulated current density on the
surface of a 50 μm × 50 μm × 5 μm plasmonic particle at a
fixed incident wavelength of 10 μm. The spatial distribution
of current is a direct consequence of its finite lateral size and
will affect the total absorbed energy. In Figure 9, we plot
the average reflectance of the same particle as a function
of wavelength. While the total absorption is slightly less
and the resonance wavelength is slightly shifted towards
longer wavelength, the finite-sized particles still behave with
the same general absorption characteristics as the infinitely
periodic predictions described previously.

4.  Experimental  Fabrication 

To  fabricate  the  samples,  a  thin  (80  microns)  2-inch  silicon 
wafer  was  first  mounted  onto  a  3-inch  (350-500  micron) 
silicon  carrier  wafer  using  Aquabond  55  Adhesive  Products 
wax.  The  carrier  wafer  was  placed  on  a  hot  plate  at  a 
temperature of 80 °C. A small amount of wax was smeared
on  the  surface  starting  at  the  center  and  working  outward. 
The  thin  silicon  wafer  was  carefully  placed  on  top  of  the 
wax.  A  flat  glass  plate  was  placed  on  top  of  the  thin  wafer, 
followed  by  a  brass  weight.  This  was  to  keep  the  silicon 
wafer  as  flat  as  possible  during  the  mounting  procedure. 
The  hot  plate  was  turned  off,  and  the  wax  was  allowed  to 
cool  to  room  temperature.  Excess  wax  on  and  around  the 
mounted  silicon  wafer  was  removed  by  gently  swabbing  it 
away  with  a  1%  solution  of  Aqua  Clean.  The  wafer  assembly 
was  placed  in  a  vacuum  electron  beam  evaporator.  A  blanket 
metallization of 300 Å of chromium followed by 2000 Å of
gold was evaporated onto the wafer. The assembled structure
was then moved to another vacuum e-beam evaporator,
and a 1.8-micron-thick layer of ZnSe was evaporated onto
the surface. Depositions were performed at 145 °C, with a
base pressure of 1 × 10⁻⁶. A 120-Å layer of yttrium oxide
(Y2O3) was deposited first to promote adhesion between the
substrate and the ZnSe.

Figure 8: Current density distribution for a finite-sized resonant
particle. Simulations were conducted using HFSS FEM solver.

Figure 9: Predicted reflectance curves for a finite-sized particle
compared to the infinitely periodic calculations. The edge effects of
the finite-sized sample are evident but do not significantly alter the
resonant peak.

Photolithography  on  the  ZnSe  was  achieved  by  first  spin 
coating  the  wafer  assembly  with  AZ  #  5214  image  reversal 
photoresist  at  a  speed  of  4000  rpm  for  40  seconds.  This 
photoresist  was  hot  plate  baked  at  110°C  for  2  minutes, 
exposed  on  a  JBA  vacuum  contact  aligner  for  20  seconds  with 
a bulb intensity of 4 mW/cm², hotplate baked (reversal bake)
at 124 °C for 40 seconds, and flood exposed for 25 seconds.
The  resist  was  then  developed  in  AZ  300  MIF  photoresist 
developer  for  60  seconds  and  rinsed  in  deionized  (DI)  water 
for  1  minute.  The  wafer  was  then  dried  with  nitrogen 
gas.  The  resulting  photolithography  was  inspected  under  a 
microscope  for  clearing.  Prior  to  loading  the  wafer  assembly 
into  the  e-beam  evaporator  for  the  grating  structure,  a 
photoresist  cleaning  in  a  barrel  plasma  asher  was  performed. 
The patterned wafer assembly was placed into the vacuum
e-beam evaporator, and a metallization of 300 Å of titanium (Ti)
followed by 1000 Å of gold was completed. A metal liftoff
using acetone, isopropyl alcohol, and DI water removed the excess
metal.  This  fabrication  process  is  graphically  illustrated  in 
Figure  10. 


5.  Experimental  Characterization 

Experimental characterization results for samples that were
fabricated using the method described earlier are shown in
Figures 11 and 12. For these samples, the ZnSe substrate
thickness was fixed at 1.8 μm and the linear gold gratings
had a period of 3.0 μm with a 50% duty cycle.

The IR reflectance and emission measurements were
made using a Nicolet 560 Fourier transform infrared (FTIR)
spectrometer with a near-normal incidence reflectivity
module and an input port for collecting IR emission or
photoluminescence. The reflectivity was taken at room temperature
as a function of incident polarization. Although the polarized
emission could be easily detected at room temperature, the
signal-to-noise ratio was improved by taking the data at
an elevated temperature. The experimental results, which
closely match the modeled results, demonstrate a strong
resonant absorption and thermal emission near the designed
wavelength.

6.  Alternative  Polarization  Insensitive  Designs 

One disadvantage of using the resonant particles shown
in Figure 2 is their sensitivity to polarization. This reduces
the total absorbed energy by one half. To address this issue,
we explored a number of designs that were less sensitive
to incident field polarization. These structures, shown in
Figure 13, consist of 2D arrays of gold strips (known
commonly as a fishnet structure), metallic patches, and
circular holes. Each of the structures shown in Figure 13
was analyzed using the RCW method. Of those structures
analyzed, the inductive grid array (Figure 13(c)) showed
the most promise. Figure 14 presents numerical simulations
of normal incident reflectance as a function of wavelength.
A strong, nearly perfect, absorption is predicted for both
parallel and perpendicular polarization. Moreover, as in the
previous designs, the resonant wavelength was easily tuned
by simply varying the thickness of the dielectric substrate
layer. It should be noted that the results shown in Figure 14
have not been experimentally validated yet.


Figure 10: Illustration of the fabrication steps.

Figure 11: A comparison of predicted (using RCW code) and
measured reflectance for both parallel (P type) and perpendicular
(S type) polarizations. (Legend: experiment and calculation for P and S polarizations.)

Figure 12: Experimentally measured reflectance curve and
emission curve clearly demonstrating the resonant nature of our design.
(Legend: reflectance (P), emission.)

Figure 13: Designs less sensitive to polarization effects.

Figure 14: RCW simulations for the inductive grid array shown in Figure 13(c). As the thickness of the substrate, assumed to be ZnSe, is varied from 1.75
to 2.5 μm, the resonant absorption wavelength shifts to longer wavelengths; however, the magnitude of the absorption remains near perfect.
(Legend: substrate thickness = 1.75, 2, 2.25, and 2.5 microns.)


7. Conclusions

In this paper, we presented a design methodology to create
small particles characterized by a strong resonant absorption
within the LWIR (8-14 μm) band. Our method combined a
surface plasmon resonance, created using a subwavelength
metallic grating, with a dielectric cavity resonance. We
showed that by varying the thickness of the cavity substrate
the resonances could be tuned anywhere within the
LWIR band. Experimental samples were fabricated using
photolithography and experimentally characterized. The
experimental results compared favorably with the calculated
results. We believe that material-by-design particles, such
as the ones described here, would be useful for a variety
of remote atmospheric sensing applications. In those
applications, which require relatively small particles, a
custom spectral signature with multiple wavelengths would
be achieved by mixing batches of single-wavelength particles
designed for the component wavelengths. But in other
applications, such as calibrated surfaces for hyperspectral
imager testing and training, the surfaces could be larger
and the multiple wavelengths could be designed into a
single surface by implementing checkerboard subcells with
different grating periods across the surface. By properly
selecting the frequencies and relative areas of the emitting
subcells, the emission spectrum could be designed to mimic
the spectral emission from specific natural surfaces.
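
The role of the relative subcell areas can be sketched with a simple area-weighted sum. This is a simplification that assumes the subcells emit independently, with no coupling between neighboring cells; the Lorentzian lineshape, its width and peak value, and the two subcell wavelengths are hypothetical.

```python
def lorentzian_emissivity(wavelength_um, center_um, width_um=0.6, peak=0.95):
    """Hypothetical single-resonance emissivity with a Lorentzian lineshape."""
    return peak / (1.0 + ((wavelength_um - center_um) / (width_um / 2.0)) ** 2)

def composite_emissivity(wavelength_um, subcells):
    """Area-weighted emissivity; subcells is a list of (area_fraction, center_um)."""
    total = sum(fraction for fraction, _ in subcells)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("area fractions must sum to 1")
    return sum(fraction * lorentzian_emissivity(wavelength_um, center)
               for fraction, center in subcells)

# Two hypothetical subcell designs sharing the surface equally.
surface = [(0.5, 9.0), (0.5, 12.0)]
for lam in (9.0, 10.5, 12.0):
    print(f"{lam:5.1f} um -> emissivity {composite_emissivity(lam, surface):.3f}")
```

Changing the area fractions rescales the height of each line independently, which is the degree of freedom the text proposes for matching a target spectrum.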

With the prevalence of surveillance systems, face recognition is crucial to aiding the law enforcement
community and homeland security in identifying suspects and suspicious individuals on watch lists. However,
face recognition performance is severely affected by the low face resolution of individuals in typical
surveillance footage, oftentimes due to the distance of individuals from the cameras as well as the small pixel
count of low-cost surveillance systems. Superresolution image reconstruction has the potential to improve
face recognition performance by using a sequence of low-resolution images of an individual’s face in the
same pose to reconstruct a more detailed high-resolution facial image. This work conducts an extensive
performance evaluation of superresolution for a face recognition algorithm using a methodology and
experimental setup consistent with real world settings at multiple subject-to-camera distances. Results show
that superresolution image reconstruction improves face recognition performance considerably at the
examined midrange and close range.

OCIS  codes:  100.0100,  100.6640,  100.4995,  100.2980. 


1. Introduction

The affordability of surveillance systems has led to
their widespread usage on commercial properties
and for residential monitoring. Consequently, video
footage of criminal activity is often available to
law enforcement to help identify suspects. Therefore,
face recognition software is a crucial tool that the law
enforcement community may use to search watch
lists and criminal databases to identify the suspect
acquired on video. However, typical low-cost
surveillance systems have small pixel counts. Furthermore,
the suspect could be far away from the camera,
resulting in images with a very limited number of pixels
on the face (i.e., low face resolution).

Research studies have shown that while face recognition
algorithm performance is dependent on face
resolution, this dependence is highly nonlinear. Boom
et al. [1] examined the effect of resolution on face
recognition and observed that performance became
severely degraded for face images with sizes less than
32 × 32 pixels. However, performance was observed
to be fairly similar for face images with sizes ranging
from 32 × 32 pixels to 128 × 128 pixels [1],
substantiating the highly nonlinear nature of face
recognition performance with respect to face resolution. The
Facial  Recognition  Vendor  Test  2000  [2]  also  observed 
that  the  evaluated  face  recognition  systems  yielded 
similar  performance  for  face  images  with  face  resolu¬ 
tions  of  30  to  60  pixels  measured  in  terms  of  eye-to-eye 
distance,  but  that  performance  severely  degraded  for 
some  algorithms  at  an  eye-to-eye  distance  of  15  pixels. 
In  the  authors’  experience  of  working  with  law 
enforcement  agencies,  it  is  not  uncommon  for  faces 
in  typical  surveillance  footage  from  residential  and 
commercial  properties  to  have  resolutions  less  than 
30  pixels  in  terms  of  eye-to-eye  distance,  especially 
when  the  suspect  is  far  away  from  the  camera.  There¬ 
fore,  the  limited  face  resolution  within  surveillance 
footage  is  a  major  obstacle  for  face  recognition  soft¬ 
ware.  Pennsylvania  Justice  Network  (JNET)  states 
that  low  resolution  and  distance  are  two  of  the  main 
factors  that  limited  face  recognition  effectiveness  of  its 
statewide  implementation  of  a  facial  recognition 
search system for investigators [3].

Face recognition continues to be an active area of
research  focused  on  improving  performance  through 
the  development  of  new  feature  transforms,  classifi¬ 
cation  techniques,  and  mathematical  frameworks  to 
handle  the  large  variability  of  face  imagery  found  in 
real  life.  Many  factors  contribute  to  this  variability: 
illumination,  pose,  and  scale/resolution  are  several  of 
the  main  factors.  While  research  has  been  predomi¬ 
nantly  focused  on  solving  the  pose  and  illumination 
challenges  for  face  recognition,  some  efforts  have 
been  devoted  to  solving  the  face  resolution  problem 
through  the  use  of  superresolution  image  reconstruc¬ 
tion,  which  utilizes  a  sequence  of  low-resolution  (LR) 
images  containing  the  face  in  the  same  pose  to  recon¬ 
struct  a  high-resolution  face  image  with  more  de¬ 
tails.  Boult  et  al.  [4]  proposed  a  superresolution 
method  via  image  warping  for  face  recognition,  and 
Baker  and  Kanade  [5]  proposed  hallucinating  faces 
through  a  Gaussian  pyramid-based  method;  how¬ 
ever,  these  works  did  not  conduct  a  performance  eva¬ 
luation  to  assess  the  benefit  of  superresolution  for 
face  recognition.  More  recently,  Wheeler  et  al.  [6]  de¬ 
veloped  a  multiframe  face  superresolution  method 
with  an  active  appearance  model  for  registration 
and  evaluated  the  face  recognition  improvement 
using the Identix FaceIt software. However, only 138
images  (split  between  six  ranges  in  terms  of  eye-to- 
eye  distance)  from  three  test  subjects  were  used  in 
[6].  Due  to  the  small  sample  size,  the  observed  im¬ 
provement  with  superresolution  is  unlikely  to  be 
statistically  meaningful  in  [6].  Whereas  [4-6]  per¬ 
form  superresolution  image  enhancement  in  the 
pixel  domain  prior  to  the  face  recognition  algorithm, 
Gunturk  et  al.  [7]  developed  an  eigenface-domain 
superresolution  technique  for  face  recognition.  The 
algorithm  of  [7]  performs  superresolution  recon¬ 
struction  in  a  low-dimensional  face  space  through 
principal  component  analysis  (PCA)-based  dimen¬ 
sionality  reduction  and  showed  an  improvement  in 
face  recognition  performance  with  a  minimum 
distance  classifier  in  the  eigenspace.  In  contrast 
to  [4-7],  Hennings-Yeomans  et  al.  [8]  proposed  an 
algorithm  incorporating  face  features  into  super¬ 
resolution  as  prior  information,  and  Huang  et  al.  [9] 
developed  a  superresolution  approach  based  on 
correlated  features  and  nonlinear  mappings  be¬ 
tween  low-resolution  and  high-resolution  features. 
Fookes  et  al.  [10]  conducted  the  most  recent  work 
on  superresolution  for  face  recognition,  evaluating 
the  performance  of  two  face  recognition  algorithms 
with  three  superresolution  techniques.  However,  as 
in  [7-9],  Fookes  et  al.  [10]  also  utilized  synthetically 
generated  LR  face  images  by  downsampling  the 
original  high-resolution  imagery.  Downsampled 
face  imagery  does  not  accurately  depict  real-world 
compressed  surveillance  face  images  at  varying 
subject-to-camera  distances,  especially  since  com¬ 
pression  is  highly  nonlinear  with  more  pronounced 
effects  on  facial  details  for  far  subject-to-camera 
ranges. Although [10] also used a white Gaussian
noise corrupted version of the downsampled sets,
the added white Gaussian noise does not resemble
compression artifacts. The goal of this work is to
conduct a comprehensive performance assessment of a
state-of-the-art baseline face recognition algorithm
[11,12] with the pixel-level superresolution method
of Young et al. [13] using a large database containing
videos similar to real-world surveillance footage.

Specifically, the objectives of this work are to

(a) assess the benefit of superresolution for face
recognition with respect to subject-to-camera range,

(b) assess face recognition performance using
superresolved imagery reconstructed using varying
numbers of LR frames, and

(c) evaluate face recognition performance of individual
frames within the LR sequence as well as the
performance of a decision level fusion of the sequence
to compare with superresolution face recognition results.

The database of
moving  faces  and  people  acquired  by  O’Toole  et  al. 
[14]  was  used  for  this  study,  specifically  the  parallel 
gait  video  datasets  and  close-up  mug  shots.  Face  re¬ 
cognition  performance  with  the  LR  and  super- 
resolved  imagery  was  assessed  with  the  local  region 
principal  component  analysis  (LRPCA)  face  recog¬ 
nition  algorithm  [11,12]  developed  at  Colorado 
State  University.  Correct  verification  rates  are  cal¬ 
culated  and  compared  at  three  face  resolutions/ 
scales  in  terms  of  eye-to-eye  distance  corresponding 
to  different  subject-to-camera  distances  within  the 
video  footage.  Results  show  that  superresolution 
image  reconstruction  significantly  improves  face 
recognition  verification  rates  at  the  examined  mid- 
and  close  ranges,  with  some  improvement  at  the 
far  range. 

2.  Methodology 

A.  Database 

Parallel  gait  videos  and  static  mug  shot  images  from 
the video database of moving faces and people [14] are
used  for  this  work.  The  parallel  gait  video  shows  the 
subject  moving  towards  the  camera  from  13.6  m 
away  to  approximately  1.5  m  away,  providing  a  large 
sequence  of  face  imagery  at  different  face  resolutions 
from  which  query  sets  can  be  formed.  A  sample  frame 
containing  the  subject  at  the  far  range  is  shown  in 
Fig.  1.  Since  faces  in  the  parallel  gait  videos  are  ac¬ 
quired  from  the  frontal  perspective,  the  correspond¬ 
ing  frontal  mug  shots  are  used  to  form  the  gallery  set. 
The  resolution  of  the  videos  as  well  as  of  the  frontal 
mug  shot  is  720  x  480  pixels  (note  that  the  corre¬ 
sponding  pixel  count  is  345,600  pixels,  substantially 
less  than  even  one  megapixel).  The  videos  were  ac¬ 
quired  with  compression  using  a  Canon  Optura  Pi 
digital video camera. Figure 2 shows (a) close range
face image of a subject, (b) close range face image
downsampled by a factor of 3 to simulate far range
using the procedure of [10], and (c) far range face image
of the subject taken from the same video. The
downsampling procedure of [10] involved convolving the
close range face image with a Gaussian filter of
d/4 and then downsampling by d, where d is the
downsampling factor. Note that while the compression
artifacts are not obvious in the close range face
image (simply because facial features consist of
many pixels), the compression distortions are highly
pronounced in the actual far range face image taken
from the same video. Simulating the far range face
image by downsampling, as past studies have done
in examining superresolution for face recognition,
does not closely resemble actual far range face
imagery due to the effects of compression. The parallel
gait videos used in this work emulate real-world
compressed surveillance footage and enable a realistic
assessment of superresolution benefit for face
recognition.

Fig. 1. (Color online) Sample frame extracted from a subject’s
parallel gait video in the database of moving faces and people
[14]. Subject is at the far range (resulting eye-to-eye distance of
5-10 pixels).

B.  Superresolution 

This study used the reconstruction-based
superresolution algorithm of Young and Driggers [13],
which utilizes a series of undersampled/aliased LR
images to reconstruct an alias-free high-resolution
image. This reconstruction-based superresolution
algorithm consists of a registration stage and a
reconstruction stage. The registration stage
computes the gross shift and subpixel shift of each frame
in the sequence with respect to the reference frame
using the correlation method in the frequency
domain. The reconstruction stage uses the error-energy
reduction method with constraints in both spatial and
frequency domains, generating a superresolved
image that improves the high-frequency content that
was lost or corrupted due to the undersampling/
aliasing of the sensor. The resolution improvement
factor of the superresolved image is the square root
of the number of frames used to reconstruct the
superresolved image. A necessary condition for
superresolution benefit is the presence of different subpixel
shifts between frames to provide distinct information
from which to reconstruct a high-resolution image.
The natural movement of the subject in the parallel
gait video provided this necessary subpixel shift.

Fig. 2. (a) Close range face image of a subject, (b) close range face
image downsampled by a factor of 3 to simulate far range using
procedure of Fookes et al. [10], and (c) far range face image of
the subject taken from the same video.

C.  Query  Sets 

Frame  sequences  at  three  different  subject-to- 
camera  distances  are  extracted  from  each  subject’s 
parallel  gait  video:  far  range  (~13  m),  midrange 
(~9  m),  and  close  range  (~5  m).  The  face  resolutions 
(in  terms  of  eye-to-eye  distances)  corresponding  to 
the  far,  mid-,  and  close  ranges  are  5-10,  15-20, 
and  25-30  pixels,  respectively.  Three  query  sets 
are  constructed  for  each  range:  (a)  original  LR  ima¬ 
gery  (taken  as  the  first  frame  within  the  sequence), 
(b)  superresolved  imagery  using  four  consecutive  LR 
frames  (SR4),  and  (c)  superresolved  imagery  using 
eight  consecutive  LR  frames  (SR8).  SR4  and  SR8  en¬ 
able  an  assessment  of  the  impact  of  the  number  of 
frames  used  for  superresolution  on  face  recognition 
performance.  The  resolution  improvement  factor  in 
the x and y directions is 2 and 2.8 for SR4 and
SR8,  respectively.  Consequently,  the  size  of  the 
SR4  face  image  is  a  factor  of  2  larger  in  the  x  and 
y  dimensions  than  the  corresponding  LR  face  image; 
the  size  of  the  SR8  face  image  is  a  factor  of  2.8  larger 
in  the  x  and  y  dimensions  than  the  LR  face  image. 
A  total  of  nine  different  query  sets  (Table  1)  are  gen¬ 
erated  to  evaluate  the  improvement  in  face  recogni¬ 
tion  with  superresolution;  each  query  set  contains 
200  subjects  with  one  image  per  subject. 
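
The square-root relationship between frame count and resolution improvement can be checked
with a few lines of Python. This is only an illustrative sketch: the function names are
hypothetical, and the "effective" eye-to-eye distance (the LR eye-to-eye distance multiplied
by the improvement factor) is used here simply to show how the three ranges scale under SR4
and SR8.

```python
import math


def improvement_factor(num_frames):
    """Resolution improvement factor: square root of the number of LR frames."""
    return math.sqrt(num_frames)


def effective_eye_distance(eye_range_px, num_frames):
    """Scale an (min, max) eye-to-eye range in pixels by the improvement factor."""
    factor = improvement_factor(num_frames)
    low, high = eye_range_px
    return (low * factor, high * factor)


# Eye-to-eye ranges quoted in the text for the far, mid-, and close ranges.
RANGES_PX = {"far": (5, 10), "mid": (15, 20), "close": (25, 30)}

if __name__ == "__main__":
    for name, eye_range in RANGES_PX.items():
        for frames in (4, 8):
            low, high = effective_eye_distance(eye_range, frames)
            print(f"{name:5s} SR{frames}: {low:.1f}-{high:.1f} pixels")
```

Under these assumptions the midrange SR4 imagery already reaches an effective eye-to-eye
distance of 30-40 pixels, which is useful context when comparing against the 30 pixel figure
cited from [1,2].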

D.  Face  Recognition 

This study used the state-of-the-art baseline LRPCA
face recognition algorithm developed by Bolme et al.
[11,12]. The high-frequency content recovered in the
superresolved imagery is expected to aid principal
component analysis (PCA)-based methods, since current
PCA-based algorithms often employ a large
number of basis vectors (on the order of thousands
for this study). As a preprocessing step, all query
and gallery images are cropped and normalized to
256 × 256 pixels through bilinear interpolation using
manually defined eye coordinates. The LRPCA algorithm
was trained using “The Good, The Bad, and
The Ugly” (GBU) subset of the Multiple Biometric
Grand Challenge, containing a total of 522 subjects.
Training on a separate dataset distinct from the
query and gallery sets avoids biasing the performance
of the algorithm. The gallery set corresponding
to each query set consists of one frontal mug shot
for each subject.

Table 1. Query Set Nomenclature^a

                          5-10 Pixels    15-20 Pixels    25-30 Pixels
Low-resolution            LR_{5-10}      LR_{15-20}      LR_{25-30}
Superresolved, 4 frames   SR4_{5-10}     SR4_{15-20}     SR4_{25-30}
Superresolved, 8 frames   SR8_{5-10}     SR8_{15-20}     SR8_{25-30}

^a Top row represents subject-to-camera range in terms of eye-to-eye distance.

E.  Performance  Measurement 

For  each  query  set  and  gallery,  the  output  of  the 
LRPCA  face  recognition  algorithm  is  a  similarity 
matrix  S  containing  the  similarity  measure  between 
every  probe  in  the  query  set  and  every  gallery  image. 
Note that for this work, both the query and gallery
sets contain a single image of each subject ($N = 200$
subjects total); therefore, the similarity matrix is an
$N \times N$ square matrix with the diagonal elements
containing the $N$ match scores and the $N(N - 1)$ off-diagonal
elements containing the nonmatch scores.

1.  Receiver  Operating  Characteristic  Curves 

The  similarity  matrix  is  used  to  compute  the  correct 
verification  rates  as  well  as  the  corresponding  false 
accept rates (FARs). In the verification model, the
face recognition system is tasked with deciding
whether the person in the probe image $p_i$ is the same
as the person in the gallery image $g_j$ [15]. The
decision is made based on the Neyman-Pearson theorem,
testing whether the similarity score between $p_i$
and $g_j$ exceeds a given threshold $t_0$. The correct
verification rate is computed by tallying the number of
diagonal elements (match scores) that exceed $t_0$, and
the FAR is computed by tallying the number of
off-diagonal elements (nonmatch scores) that exceed $t_0$
[15]. Receiver operating characteristic (ROC) curves
were generated by thresholding the similarity
matrix $S$ at various thresholds across the range from
$S_{\min}$ to $S_{\max}$. For each of the nine query sets listed
in Table 1, a ROC curve was constructed in this
manner.
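
The thresholding procedure above can be sketched in Python as follows. This is a minimal
illustration and not the authors' implementation: the function names are hypothetical,
"exceeds" is read as a strict inequality, the tallies are normalized by the number of match
and nonmatch scores to give rates, and the thresholds are assumed to be evenly spaced between
the smallest and largest similarity scores.

```python
def match_and_nonmatch_scores(similarity):
    """Split a square similarity matrix into diagonal and off-diagonal scores."""
    n = len(similarity)
    match = [similarity[i][i] for i in range(n)]
    nonmatch = [similarity[i][j] for i in range(n) for j in range(n) if i != j]
    return match, nonmatch


def rates_at_threshold(similarity, t0):
    """Return (correct verification rate, false accept rate) at threshold t0."""
    match, nonmatch = match_and_nonmatch_scores(similarity)
    verification_rate = sum(s > t0 for s in match) / len(match)
    false_accept_rate = sum(s > t0 for s in nonmatch) / len(nonmatch)
    return verification_rate, false_accept_rate


def roc_curve(similarity, num_thresholds=101):
    """Sweep thresholds from the minimum to the maximum score.

    Returns a list of (false accept rate, verification rate) points.
    The even spacing of thresholds is an assumption of this sketch.
    """
    flat = [s for row in similarity for s in row]
    s_min, s_max = min(flat), max(flat)
    step = (s_max - s_min) / (num_thresholds - 1)
    points = []
    for k in range(num_thresholds):
        vr, far = rates_at_threshold(similarity, s_min + k * step)
        points.append((far, vr))
    return points
```

For the 200 × 200 matrices used in this study, each threshold is evaluated against 200 match
scores and 39,800 nonmatch scores.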

2.  Performance  with  Respect  to  Range 

To  visualize  face  recognition  performance  with  respect 
to  subject-to-camera  range,  the  correct  verification 
rates  are  plotted  with  respect  to  range  at  commonly 
used  FARs  of  0.01  and  0.05  for  LR,  SR4,  and  SR8.  Con¬ 
fidence  intervals  are  calculated  and  overlaid  onto  the 
plots  to  assess  the  statistical  reliability  of  the  perfor¬ 
mance  improvement  achieved  with  superresolution 
image  reconstruction. 


3.  Confidence  Intervals 

To  indicate  the  reliability  of  the  calculated  correct 
verification  rates,  95%  confidence  intervals  are  de¬ 
termined  using  the  bootstrap  method,  specifically 
following  the  procedure  for  biometrics  detailed  in 
[16] .  The  bootstrap  is  a  nonparametric  approach  that 
makes  no  assumptions  about  the  error  distribution 
and  is  preferable  to  parametric  techniques  when 
the  underlying  distribution  is  unknown,  as  is  the 
case  for  biometrics.  Bootstrap  involves  resampling 
the  available  data  (match  scores  for  this  study)  many 
times  with  replacement  to  generate  confidence  inter¬ 
vals.  For  this  work,  the  probe  set  contained  one  im¬ 
age  per  subject  and  the  gallery  contained  one  image 
per subject for LR, SR4, and SR8 at each range,
satisfying the independent and identically distributed
(i.i.d.) requirement of the bootstrap.

Recall that the output of the LRPCA algorithm is a
similarity matrix $S$ containing scores of the similarity
between a probe and all gallery images. Also
recall that $S$ contains $M = N$ match scores along the
diagonal and $N(N - 1)$ nonmatch scores, where $N = 200$
is the number of subjects. For a given $t_0$, let the
verification rate estimate be defined by the equation

$$\hat{F}(t_0) = \frac{1}{M} \sum_{i=1}^{M} \mathbf{1}(X_i > t_0), \qquad (1)$$

where $X = \{X_1, \ldots, X_M\}$ denotes the set of $M$ match scores and $\mathbf{1}$ is
the indicator function [16]. The bootstrap generates
$X^* = \{X_1^*, \ldots, X_M^*\}$ by resampling with replacement,
and then calculates $\hat{F}^*(t_0)$. This resampling procedure
is repeated $B$ times ($B = 10{,}000$ for this work),
generating bootstrap estimates $\hat{F}^* = (\hat{F}_1^*, \hat{F}_2^*, \ldots, \hat{F}_B^*)$.
The lower and upper bounds of the 95% confidence
interval are determined as the values corresponding to
the 2.5th and 97.5th percentiles of the histogram of
the $B$ bootstrap estimates $\hat{F}^*$.
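
A percentile bootstrap of this kind can be sketched with the Python standard library. The
code below is an assumption-laden illustration rather than the procedure of [16] itself: the
function names, the fixed random seed, and the nearest-rank choice of percentile are all
choices made for this sketch, and only the match scores are resampled, as described above.

```python
import random


def verification_rate(match_scores, t0):
    """Fraction of match scores that exceed the threshold t0."""
    return sum(score > t0 for score in match_scores) / len(match_scores)


def bootstrap_interval(match_scores, t0, num_resamples=10000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the verification rate.

    Each resample draws len(match_scores) scores with replacement and
    recomputes the verification rate; the interval bounds are read off the
    sorted bootstrap estimates (nearest-rank percentiles, an assumption).
    """
    rng = random.Random(seed)  # fixed seed only for repeatability of the sketch
    m = len(match_scores)
    estimates = sorted(
        verification_rate(rng.choices(match_scores, k=m), t0)
        for _ in range(num_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = estimates[round(tail * (num_resamples - 1))]
    upper = estimates[round((1.0 - tail) * (num_resamples - 1))]
    return lower, upper
```

In use, `match_scores` would be the diagonal of the similarity matrix for one query set and
`t0` the threshold that yields the FAR of interest.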

4.  Face  Recognition  Performance  of  Individual 
Frames  and  Decision  Level  Fusion 
To  address  the  question  of  how  face  recognition  per¬ 
formance  with  superresolution  compares  to  face  re¬ 
cognition  performance  of  individual  frames  within 
the  LR  sequence  as  well  as  to  the  performance  of  a 
decision  level  fusion  scheme,  further  analysis  was 
conducted.  Superresolution  exploits  the  additional 
spatial  information  contained  in  the  temporal  di¬ 
mension  (i.e.,  multiple  frames)  to  reconstruct  a  more 
detailed  face  image  for  recognition,  and  therefore  is 
expected  to  exceed  the  face  recognition  performance 
of  any  single  frame  within  the  LR  sequence.  To  vali¬ 
date  this  expectation,  face  recognition  performance 
was  computed  for  each  of  the  eight  LR  frames  used 
to  reconstruct  SR8  and  compared  to  face  recognition 
performance  of  SR8.  Furthermore,  a  simple  fusion 
scheme  for  the  LR  frame  sequence  was  implemented 
by  averaging  the  similarity  matrices  from  the 
eight LR frames. Fusion by averaging of similarity
matrices exploits the spatial information in the temporal
domain at the decision level and is expected to
be an upper bound on the face recognition performance
of any individual frame. Face recognition
performance with superresolved imagery is then
compared to the performance of this decision level
fusion scheme.

A  total  of  24  query  sets  (eight  query  sets  per  range 
corresponding  to  each  of  the  eight  frames  in  the  LR 
sequence  used  to  reconstruct  SR8;  Table  2)  was  gen¬ 
erated  to  assess  the  variation  in  face  recognition  per¬ 
formance  with  respect  to  individual  frames.  Note 
that  the  LR  sequence  is  an  eight  frame  clip  of  the 
subject  walking  towards  the  camera.  Due  to  the  fast 
frame  rate  (30  Hz)  and  relatively  slow  speed  of  the 
subjects  (walking  speed),  the  change  in  pose  is  insig¬ 
nificant  across  the  eight  frames  in  the  sequence.  At 
the  far  range,  since  the  change  in  face  size  across  the 
eight  frames  does  not  exceed  a  single  pixel,  the  same 
eye coordinates in terms of (x, y) pixel locations are
used  for  all  eight  frames.  At  the  mid-  and  close 
ranges,  face  size  does  enlarge  by  a  few  pixels  across 
the  frames;  therefore,  eye  coordinates  are  manually 
picked  for  all  eight  frames  instead  of  for  just  the  first 
frame  as  in  the  far  range.  Once  the  similarity  matrix 
for  each  query  set  is  computed  with  the  LRPCA  algo¬ 
rithm,  ROC  curves  of  face  recognition  performance 
with  respect  to  individual  frames  can  be  generated. 
The  decision  level  fusion  method  (denoted  LRave) 
averages  the  similarity  matrices  across  the  eight 
frames  at  each  range  and  generates  the  ROC  curve 
using  the  averaged  similarity  matrix  for  comparison 
with  face  recognition  using  superresolved  imagery. 
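
The decision level fusion step amounts to an element-wise mean over the per-frame similarity
matrices. A minimal sketch follows, assuming each matrix is stored as a list of rows of equal
shape; the function name is hypothetical.

```python
def average_similarity_matrices(matrices):
    """Element-wise mean of equally shaped similarity matrices (one per frame)."""
    num_frames = len(matrices)
    num_rows = len(matrices[0])
    num_cols = len(matrices[0][0])
    return [
        [sum(m[i][j] for m in matrices) / num_frames for j in range(num_cols)]
        for i in range(num_rows)
    ]
```

The averaged matrix can then be thresholded in the same way as any single-frame similarity
matrix to obtain the LRave ROC curve.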

3.  Results  and  Discussion 

A.  Superresolved  Imagery 

Superresolved  face  imagery  and  original  low  resolu¬ 
tion  face  imagery  are  shown  in  Fig.  3  at  different 
ranges.  At  the  far  range,  the  LR  image  is  heavily 
pixelated and distorted by compression, yielding a
coarse  facial  outline  and  few  facial  features.  Super¬ 
resolution  with  four  and  eight  frames  enhances  the 
facial  outline  and  some  facial  details,  but  compres¬ 
sion  artifacts  have  almost  completely  eliminated 
facial  details  in  the  low  resolution  frames,  preventing 
significant  facial  feature  enhancement. 

As range decreases, the camera captures finer details
and the detrimental impact of compression on
facial features lessens because the size of these
features is now larger. At the midrange, SR4 and SR8
produce considerable enhancement of the subject’s
facial details. As range continues to decrease to
the close range, superresolution benefit decreases
as facial features become more and more defined in
the low resolution imagery. Although the close range
SR images may not appear significantly enhanced
visually, facial recognition algorithms may still benefit
from superresolution as these algorithms operate on
different principles than the human visual system.

Fig. 3. Low-resolution (LR) imagery and superresolved imagery
(4 frames: SR4, 8 frames: SR8) at eye-to-eye distances of
5-10, 15-20, and 25-30 pixels. All images at all ranges have been
resized to a fixed size for comparison.

To provide a more objective assessment of the
increase in high-frequency content with superresolution,
spectral analysis was conducted using LR and
superresolved face imagery. Let the following equations
denote the cumulative power spectrum in
wavenumber $k_x$ and $k_y$, respectively, where $F(k_x, k_y)$
is the Fourier transform of the considered image:

$$S_1(k_x) = \sum_{k_y} |F(k_x, k_y)|^2, \qquad (2)$$

$$S_2(k_y) = \sum_{k_x} |F(k_x, k_y)|^2. \qquad (3)$$
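
As a rough illustration of the cumulative power spectrum, the sketch below computes both
sums for a small grayscale image stored as a list of rows (`image[y][x]`). It uses a naive
discrete Fourier transform from the standard library so that it runs without dependencies;
in practice an FFT routine would be the usual choice. The function names and the image layout
are assumptions of this sketch, not part of the original study.

```python
import cmath


def dft_1d(values):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def dft_2d(image):
    """2-D DFT as row transforms followed by column transforms.

    Returns F indexed as F[ky][kx].
    """
    rows = [dft_1d(row) for row in image]        # transform along x for each y
    cols = [dft_1d(col) for col in zip(*rows)]   # transform along y for each kx
    return [list(r) for r in zip(*cols)]         # transpose back to [ky][kx]


def cumulative_power_spectra(image):
    """Return (S1, S2): power summed over ky for each kx, and over kx for each ky."""
    power = [[abs(v) ** 2 for v in row] for row in dft_2d(image)]
    s1 = [sum(col) for col in zip(*power)]  # S1(kx)
    s2 = [sum(row) for row in power]        # S2(ky)
    return s1, s2
```

By Parseval's relation, both spectra sum to the same total power, which gives a quick sanity
check when comparing LR and superresolved eye regions.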


For this study, the cumulative power spectrum is
computed over the spatial region consisting of the eyes,
which is a critical area for face recognition algorithms.
Figure 4 shows the computed wavenumber-domain spectrums
for the LR and SR8 eye regions at the midrange. The
circled part of the plot in Fig. 4 represents the
high-frequency band recovered with superresolution
using a sequence of eight aliased LR images.

Fig. 4. (Color online) Computed wavenumber-domain cumulative power
spectrums of LR eye region and SR8 eye region at the midrange.
The circled part of the plot represents high-frequency band
recovered from using a sequence of eight aliased low-resolution frames.

Table 2. Query Set Nomenclature for Evaluation of Face Recognition with Respect to Low-Resolution Frame^a

             Frame 1        Frame 2        Frame 3        Frame 4        Frame 5        Frame 6        Frame 7        Frame 8
Far range    LR^1_{5-10}    LR^2_{5-10}    LR^3_{5-10}    LR^4_{5-10}    LR^5_{5-10}    LR^6_{5-10}    LR^7_{5-10}    LR^8_{5-10}
Midrange     LR^1_{15-20}   LR^2_{15-20}   LR^3_{15-20}   LR^4_{15-20}   LR^5_{15-20}   LR^6_{15-20}   LR^7_{15-20}   LR^8_{15-20}
Close range  LR^1_{25-30}   LR^2_{25-30}   LR^3_{25-30}   LR^4_{25-30}   LR^5_{25-30}   LR^6_{25-30}   LR^7_{25-30}   LR^8_{25-30}

^a Superscript denotes frame number, while subscript denotes range in terms of eye-to-eye distance in pixels.

Furthermore,  the  enhancement  in  edge  contrast  is 
demonstrated  in  Fig.  5,  which  plots  the  intensity  va¬ 
lues  of  LR  and  SR8  at  the  midrange  along  a  horizon¬ 
tal  profile  across  the  eyes.  The  improvement  in  edge 
contrast  is  especially  noticeable  across  the  pupils 
(horizontal  axis  =  ±10)  in  Fig.  5. 

B.  Receiver  Operating  Characteristic  Curves 

ROC curves at the 5-10, 15-20, and 25-30 pixel
eye-to-eye distances are shown in Figs. 6, 7, and 8,
respectively. Each figure contains three ROC
curves corresponding to the LR (red dotted line),
superresolved using four frames (SR4; dashed green
line), and superresolved using eight frames (SR8;
solid blue line) imagery. At the far range, the ROC
curves for SR4_{5-10} and SR8_{5-10} lie slightly but
consistently above the ROC curve for LR_{5-10}, suggesting
that face imagery at the far range possessed too few
details for superresolution to provide any substantial
enhancement to aid the LRPCA face recognition
algorithm. At the midrange, SR8_{15-20} outperformed
SR4_{15-20}, which in turn outperformed LR_{15-20} across
the FARs from FAR = 0.001 to 0.6; superresolution
effectively enhances facial details at the midrange
to yield a large improvement in face recognition
performance over the baseline LR imagery. At the close
range, while both SR4_{25-30} and SR8_{25-30} produced
higher face recognition performance than LR_{25-30}
at all FARs, the improvement is not as large as that
achieved at the midrange since the original imagery
already contains detailed facial features.

Fig. 5. (Color online) Pixel intensity value plots of LR and SR8
along a profile across the eye region at the midrange, showing
the improved edge contrast with SR8 in the spatial domain.

Fig. 6. (Color online) ROC curves at the far range for original low
resolution (LR_{5-10}) query set and the corresponding superresolved
query sets using four (SR4_{5-10}) and eight (SR8_{5-10}) frames.

Fig. 7. (Color online) ROC curves at the midrange for original low
resolution (LR_{15-20}) query set and the corresponding superresolved
query sets using four (SR4_{15-20}) and eight (SR8_{15-20}) frames.

Fig. 8. (Color online) ROC curves at the close range for original
low resolution (LR_{25-30}) query set and the corresponding
superresolved query sets using four (SR4_{25-30}) and eight (SR8_{25-30})
frames.

C.  Performance  with  Respect  to  Range 

For practical applications, performance at low FARs
is of particular interest; therefore, the verification rate
as a function of range is examined at FAR = 0.01 and
0.05 in Fig. 9. At both FARs, the LR curve exhibits a
slight knee at the midrange; the knee is more
pronounced for the SR4 and SR8 curves, signifying that
the change in performance with respect to range is
more nonlinear for SR imagery.

At the far range, the already limited facial details
are distorted by compression, preventing substantial
enhancement by superresolution image reconstruction.
At the midrange where the greatest benefit
from superresolution is observed, the correct
verification rate at FAR = 0.01 is 21.0% for SR4_{15-20} and
27.0% for SR8_{15-20} compared to 14.5% for LR_{15-20},
resulting in a relative improvement of 44.8% and 86.2%,
respectively. At FAR = 0.05, the midrange
correct verification rate is 37.5% for SR4_{15-20}
and 45.0% for SR8_{15-20} compared to 28.5% for
LR_{15-20}, resulting in a relative improvement of 31.6% and
57.9%, respectively.
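
The improvement percentages quoted above are relative to the LR baseline, which can be
confirmed with a short calculation. The helper name below is hypothetical; the rates are the
midrange values reported in the text.

```python
def relative_improvement(baseline_rate, improved_rate):
    """Percentage gain of improved_rate over baseline_rate."""
    return 100.0 * (improved_rate - baseline_rate) / baseline_rate


# Midrange correct verification rates (percent) reported in the text.
MIDRANGE_RATES = {
    0.01: {"LR": 14.5, "SR4": 21.0, "SR8": 27.0},
    0.05: {"LR": 28.5, "SR4": 37.5, "SR8": 45.0},
}

if __name__ == "__main__":
    for far, rates in MIDRANGE_RATES.items():
        for method in ("SR4", "SR8"):
            gain = relative_improvement(rates["LR"], rates[method])
            print(f"FAR = {far}: {method} improves on LR by {gain:.1f}%")
```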

A  large  improvement  of  the  verification  rate  occurs 
from  the  far  range  to  the  midrange,  but  the  improve¬ 
ment  is  visibly  smaller  from  the  midrange  to  the 
close range. For SR8, which produced effective
eye-to-eye distances ~2.8 times the original size, the
verification rate exhibited only a small improvement
from the midrange to the close range. These results
are  consistent  with  the  findings  of  [1,2]  that  showed 
the  improvement  in  face  recognition  performance 
slowed  considerably  once  the  eye-to-eye  distance 
surpassed  approximately  30  pixels. 

To generate the confidence intervals shown in
Fig. 9, the procedure described in Subsection 2.E.3
was performed using the similarity matrix S for
LR, SR4, and SR8 at each of the three ranges. For
the far range, although face recognition improves
with superresolution, the confidence intervals
overlap for LR, SR4, and SR8, suggesting that no
significant benefit is achieved with superresolution. At the
close range, the confidence interval for SR4 exhibits a
partial overlap with that of LR, and the confidence
interval for SR8 exhibits only a slight overlap with
that of LR. The small overlaps suggest that
superresolution improves face recognition rates with high
reliability, especially when using
eight frames. At the midrange, the confidence
interval for SR4 partially overlaps with that of LR, and the
confidence interval for SR8 exhibits no overlap at all
with that of LR. The lack of any overlap demonstrates
that the face recognition performance improvement
achieved with superresolution using eight frames is
not only highly reliable, but is also significant for
the midrange where the eye-to-eye distance is
15-20 pixels.

D.  Face  Recognition  Performance  of  Individual  Frames 
and  Decision  Level  Fusion 

Fig. 9. (Color online) Performance as a function of range at FARs of (a) 0.01 and (b) 0.05. Error bars show the 95% confidence interval for each correct verification rate.

Fig. 10. (Color online) ROC curves at the far range (5-10 pixels eye-to-eye distance) for each low resolution frame (superscript 1-8). The ROC curve for LRave is generated by averaging the similarity matrices of the eight individual frames.

To examine the LRPCA face recognition performance of individual frames within the LR sequence, the ROC curves of each LR frame are computed and shown in Figs. 10-12. The ROC curve of the simple decision level fusion method derived from averaging the similarity matrices across the eight frames (LRave) at each range is overlaid onto the plots. Averaging the similarity matrices exploits the spatial information across the eight frames at the decision level for face recognition and is compared against face recognition with superresolution (SR8).
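The decision level fusion described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes square similarity matrices whose diagonal entries are genuine (same-subject) scores and whose off-diagonal entries are impostor scores, with higher scores meaning greater similarity. The function names and the toy scores are hypothetical.

```python
def average_similarity(matrices):
    """Element-wise mean of equally sized similarity matrices (lists of rows)."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / count for c in range(cols)]
            for r in range(rows)]


def verification_rate(sim, far):
    """Fraction of genuine scores above the threshold that lets through at most
    a fraction `far` of the impostor scores."""
    n = len(sim)
    genuine = [sim[i][i] for i in range(n)]
    impostor = sorted((sim[i][j] for i in range(n) for j in range(n) if i != j),
                      reverse=True)
    # Number of impostor scores permitted to pass, capped to a valid index.
    allowed = min(int(far * len(impostor)), len(impostor) - 1)
    threshold = impostor[allowed]          # accept scores strictly above this
    return sum(g > threshold for g in genuine) / n


# Toy similarity matrices for two frames and three subjects (made-up values).
frame1 = [[0.9, 0.2, 0.6], [0.3, 0.5, 0.4], [0.1, 0.7, 0.8]]
frame2 = [[0.7, 0.4, 0.2], [0.1, 0.9, 0.2], [0.5, 0.1, 0.6]]
fused = average_similarity([frame1, frame2])
print(verification_rate(frame1, 0.0), verification_rate(fused, 0.0))
```

Sweeping `far` over [0, 1] and plotting the returned rates would trace out an ROC curve of the kind overlaid in the figures.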

Figures 10-12 show the ROC curves for each of the 8 LR query sets corresponding to different frames at the far, mid-, and close ranges, respectively. The ROC curve for the average similarity scores (LRave) across frames is shown in bold red along with the SR8 ROC curve shown in bold blue. The ROC curves for the LR frames exhibit some variation, but tend to be clustered together and lie within the confidence intervals as computed in Subsection 3.C. Note that there is no observable ordering of the ROC curves for LR^1 to LR^8 from lowest to highest. This verifies that the scale change across the eight frames as the subject walks towards the camera is minor and does not produce any patterns in the ordering of the ROC curves. At the far range, the ROC curve for the first frame (LR^1_{5-10}) interestingly lies above those of the seven other LR frames, which closely overlap with each other. This may be due to the definition of the eye coordinates based on the first frame, which were then reused for the remaining seven frames at the far range. This eye coordinate definition procedure may have produced a slightly more accurate eye coordinate selection for the first frame than for the other frames, even though the change in eye coordinates did not exceed an integer pixel across the frames in the sequence at the far range.


Fig. 11. (Color online) ROC curves at the midrange (15-20 pixels eye-to-eye distance) for each low resolution frame (superscript 1-8).

Fig. 12. (Color online) ROC curves at the close range for each low resolution frame (superscript 1-8).


The ROC curves for LRave in general lie above the ROC curve of any individual LR frame. Since LRave is a decision level fusion that exploits the spatial information across the eight frames, it is not unexpected that the ROC for LRave tends to be an upper bound for the ROC curve of any individual frame. However, the ROC curve for SR8 lies above the ROC curve of LRave at all three ranges, showing that superresolution image reconstruction is a more effective method of exploiting the spatial information across the temporal dimension to improve face recognition performance.

To provide a quantitative measure of overall face recognition performance of the LRPCA face recognition algorithm using superresolved and LR imagery, the area under the curve (AUC) is computed for each ROC curve in Figs. 10-12 across FAR ∈ [0, 1] and tabulated in Table 3. Note that the maximum possible value for the AUC is 1. The AUCs for LR are generally close to each other across frames 1-8 at each of the three ranges. The “best frame” in terms of AUC is the 1st frame for the far range, the 8th frame for the midrange, and the 6th frame for the close range, as underlined in Table 3. The AUC for SR8 is 5.1% larger than the best frame at the far range, 7.65% larger than the best frame at the midrange, and 5.04% larger than the best frame at the close range. Furthermore, the AUC for SR8 is 14.36% larger than LRave at the far range, 2.88% larger than LRave at the midrange, and 1.38% larger than LRave at the close range. Note that although the AUC for SR8 is only a few percent better than LRave at the mid- and close ranges, the increase is still substantial as the AUC was computed over the whole range of FARs (FAR ∈ [0, 1]); typically, ROC curves for detection/classification algorithms in a given experiment overlap at higher FARs (e.g., FAR > 0.1). Therefore, the results of Table 3 show that superresolution provides improvement in LRPCA face recognition performance compared to any frame as well as to the decision level fusion across the eight frames.
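The relative AUC gains quoted above can be re-derived from the entries of Table 3. The short sketch below does this arithmetic; the dictionary layout and the function name are illustrative choices, and only the AUC values themselves come from the table. Because the table entries are rounded to four decimals, the far-range gain over LRave comes out near 14.4% here, marginally different from the 14.36% quoted in the text.

```python
# AUC values copied from Table 3 (best single LR frame, LRave, SR8).
AUC = {
    "far":   {"best_frame": 0.6200, "LRave": 0.5697, "SR8": 0.6516},
    "mid":   {"best_frame": 0.7756, "LRave": 0.8115, "SR8": 0.8349},
    "close": {"best_frame": 0.7823, "LRave": 0.8105, "SR8": 0.8217},
}


def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


for rng, row in AUC.items():
    over_best = relative_gain(row["SR8"], row["best_frame"])
    over_ave = relative_gain(row["SR8"], row["LRave"])
    print(f"{rng:5s}  vs best frame: {over_best:.2f}%   vs LRave: {over_ave:.2f}%")
```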

Using  the  best  frames  in  terms  of  AUC  from  Table  3 
at  each  range  (denoted  LR*),  verification  rates  with 
respect  to  range  are  plotted  in  Fig.  13  at  FAR  of  0.01 
and  0.05.  SR8  outperforms  LR*  as  well  as  LRave,  with 
small  to  no  overlap  of  confidence  intervals  at  the  mid- 
and  close  ranges.  At  the  midrange,  the  verification 
rate  is  0.45  for  SR8,  0.37  for  LRave,  and  0.31  for 
LR*,  representing  a  21.6%  improvement  and  a  45.2% 
improvement over LRave and LR* at FAR = 0.05, respectively. At the close range, although the performance improvement achieved with SR8 is not as significant as at the midrange, the benefit of superresolution is still substantial.

For surveillance systems on residential and commercial properties where low cost cameras are prevalent, faces of individuals captured on camera are commonly between 15 and 30 pixels across in terms of eye-to-eye distance, which corresponds to the examined mid- and close ranges. Superresolution is expected to provide significant benefits in enhancing the LR face images and improving facial recognition performance.

Table 3. Area under the Curves for LR^i, Where i ∈ [1, 8] Denotes the Frame Number, AUC for LRave (Computed from the ROC of the Average Similarity Scores across the Eight Frames), and AUC for SR8 at Far, Mid-, and Close Ranges^a

              LR^1     LR^2     LR^3     LR^4     LR^5     LR^6     LR^7     LR^8     LRave    SR8
Far range     0.6200*  0.5553   0.5339   0.5655   0.5311   0.5474   0.5583   0.5566   0.5697   0.6516
Midrange      0.7529   0.7514   0.7557   0.7542   0.7467   0.7411   0.7625   0.7756*  0.8115   0.8349
Close range   0.7559   0.7428   0.7503   0.7352   0.7584   0.7823*  0.7805   0.7679   0.8105   0.8217

^a Values marked with * (underlined in the original) denote the best frame in terms of AUC at each range.

Fig. 13. (Color online) Performance as a function of range at FARs of (a) 0.01 and (b) 0.05. Error bars show the 95% confidence interval for each correct verification rate (not shown for LRave because LRave represents averaged similarity scores, and not actual similarity measurements). LR* denotes the best of the eight LR frames (in terms of AUC) at each range.

4.  Conclusion 

Using  a  video  database  similar  to  real  world  surveil¬ 
lance  footage,  this  study  shows  that  superresolution 
provides  considerable  benefits  for  the  state  of  the  art 
baseline  LRPCA  face  recognition  algorithm  at  the 
examined  mid-  and  close  ranges.  In  surveillance  ap¬ 
plications,  low-cost  cameras  and  oftentimes  the  far 
distance  of  individuals  result  in  a  very  limited  num¬ 
ber  of  face  pixels,  severely  affecting  face  recognition 
performance.  Superresolution  image  reconstruction 
can  be  used  to  enhance  the  high-frequency  content 
of  low  resolution  surveillance  imagery,  improving 
face  recognition  performance  and  potentially  aiding 
the  nation  in  law  enforcement  and  homeland  secur¬ 
ity  applications. 

The  authors  thank  Professor  Ross  Beveridge, 
David  Bolme,  Stephen  Won,  and  Martha  Givan  for 
their  help,  as  well  as  the  reviewers  for  their  valuable 
comments and suggestions.

rate and confusion matrices on the well known Comanche (Boeing-Sikorsky, USA) forward-looking IR

1. Introduction

The  objective  of  an  automatic  target  recognition 
(ATR)  algorithm  is  to  detect  [1]  and  classify  each  tar¬ 
get  image  into  one  of  a  number  of  classes  [2] .  The  re¬ 
cognition  algorithm  may  consist  of  several  stages. 
For  example,  in  the  first  stage  a  target  is  detected 
on  the  entire  image;  in  the  second  stage,  background 
clutter  is  removed;  in  the  third  stage,  a  set  of  features 
is  computed  and  finally,  in  the  fourth  stage,  classifi¬ 
cation  is  done  by  means  of  a  classifier.  In  this  paper, 
we  mainly  focus  on  the  last  two  stages. 

Target recognition using forward-looking IR (FLIR) imagery of different targets in natural scenes is difficult due to large variations in the thermal signatures of targets. Many ATR algorithms have been proposed for FLIR imagery. Wang et al. proposed a modular neural-network-based ATR algorithm in [2]. In their algorithm, several neural networks are trained, each optimized for a local region in the image, whose classification decisions are combined to determine the final classification. Wavelet-based vector quantization was used for FLIR ATR in [3] by Chan and Nasrabadi, where a discriminative dictionary was created in the wavelet domain using learning vector quantization. A recognition method based on a hidden Markov tree that uses a Karhunen-Loeve representation was proposed by Bharadwaj and Carin in [4]. See [5] for an excellent survey of papers and experimental evaluation of FLIR ATR. The algorithms evaluated in [5] include convolutional neural network (CNN), principal component analysis (PCA), linear discriminant analysis (LDA), learning vector quantization (LVQ), modular neural networks (MNN), and two model-based algorithms, using Hausdorff metric-based matching (H-M) and geometric hashing (G-H).

FLIR images often contain unwanted thermal signatures of background clutter whose characteristics change with environmental conditions such as fog, rain, and heat, which can make target detection and recognition difficult for automated as well as human observers. Recently, Wright et al. [6] introduced a sparse-representation-based classification (SRC) algorithm for face recognition, which is claimed to be robust to varying expressions, illumination, occlusion, and disguise, and has been shown to outperform many state-of-the-art algorithms. This approach is based on the theories of compressive sensing (CS) and sparse representation (SR). The idea is to create a dictionary matrix of the training samples as column vectors. The test sample is also represented as a column vector. Different dimensionality-reduction methods are used to reduce the dimensions of both the test vector and the vectors in the dictionary. One such approach for dimensionality reduction is using random projections [6]. Random projections, using a generated sensing matrix, are taken of both the dictionary matrix and the test sample. It is then simply a matter of solving an $\ell_1$ minimization problem in order to obtain the sparse solution. Once the sparse solution is obtained, it can provide information as to which training samples the test vector most closely relates to. Furthermore, it was shown that if the sparsity of the solution is properly harnessed, the choice of features (e.g., dimensionality-reduction method) is no longer critical. Instead, the number of features for a given class and the sparse solution become critical.

Motivated  by  the  SRC  algorithm,  in  this  paper, 
we  investigate  the  effectiveness  of  SR  and  CS  for 
the  recognition  of  FLIR  target  images.  In  particular, 
we  exploit  the  inherent  block  structure  of  the  sparse 
solution  induced  by  minimization.  Furthermore, 
our  method  utilizes  a  redundant  dictionary  that  in¬ 
cludes  training  data  at  various  azimuth  angles, 
hence  achieving  orientation  invariance.  As  a  result, 
our  algorithm  has  the  ability  to  identify  targets  at 
different  orientations. 

This  paper  is  organized  as  follows:  the  theory  of 
sparse  representation  along  with  its  use  for  ATR  is 
presented  in  Section  2.  Its  extensions  based  on  block 
sparsity  (BS)  are  presented  in  Section  3.  In  Section  4, 
we  present  some  experimental  results  on  a  FLIR 
data  set  consisting  of  ten  different  targets.  Section  5 
concludes  the  paper  with  a  brief  summary  and 
discussion. 

2.  Recognition  Based  on  Sparse  Representation 

Following  [6,7],  in  this  section  we  briefly  describe  the 
use  of  SR  and  CS  for  FLIR  ATR.  Figure  1  shows  the 
overview  of  our  method. 


A.  Sparse  Representation 

Suppose that we are given L distinct target classes and a set of n training images per class. We identify an $l \times p$ gray-scale image as an N-dimensional vector that can be obtained by lexicographically stacking its columns, where $N = lp$. Let $A_k = [x_{k1}, \ldots, x_{kn}]$ be an $N \times n$ matrix of training images from the kth class. That is, $A_k$ represents the dictionary for class k. Define a new matrix or dictionary, A, as the concatenation of subdictionaries from all the classes as

$$A = [A_1, \ldots, A_L] \in \mathbb{R}^{N \times (L \cdot n)} = [x_{11}, \ldots, x_{1n} \,|\, x_{21}, \ldots, x_{2n} \,|\, \cdots \,|\, x_{L1}, \ldots, x_{Ln}].$$

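We consider an observation vector, $y \in \mathbb{R}^{N}$, of unknown class as a linear combination of the training vectors as

$$y = \sum_{i=1}^{L} \sum_{j=1}^{n} \alpha_{ij} x_{ij} \tag{1}$$

with coefficients $\alpha_{ij} \in \mathbb{R}$. The above equation can be more compactly written as

$$y = A\alpha, \tag{2}$$

where

$$\alpha = [\alpha_{11}, \ldots, \alpha_{1n} \,|\, \alpha_{21}, \ldots, \alpha_{2n} \,|\, \cdots \,|\, \alpha_{L1}, \ldots, \alpha_{Ln}]^{T} \tag{3}$$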

and $(\cdot)^{T}$ denotes the transposition operation. Now we make an assumption that given sufficient training samples of the kth class, $A_k$, any new test image $y \in \mathbb{R}^{N}$ that belongs to the same class will approximately lie in the linear span of the training samples from the class k. This implies that most of the coefficients not associated with class k in Eq. (3) will be close to zero. Hence, $\alpha$ is a sparse vector.

Fig. 1. (Color online) Overview of our approach using SR. Test target chip is represented as a linear combination of image chips from a dictionary containing all training images.

In order to represent an observed vector, $y \in \mathbb{R}^{N}$, as a sparse vector, $\alpha$, one needs to solve the system of linear equations in Eq. (2). Typically, $L \cdot n \gg N$, which makes the system of linear equations in Eq. (2) underdetermined, so that it has no unique solution. It has been shown that if $\alpha$ is sparse enough and A satisfies certain properties, then the sparsest $\alpha$ can be recovered by solving the following optimization problem [8-12],

$$\hat{\alpha} = \arg\min_{\alpha'} \|\alpha'\|_1 \quad \text{subject to} \quad y = A\alpha', \tag{4}$$

where $\|x\|_1 = \sum_i |x_i|$. This problem is often known as basis pursuit and can be solved in polynomial time [13]. Note that the $\ell_1$ norm is an approximation of the $\ell_0$ norm [14]. The approximation is necessary because the optimization problem in Eq. (4) with the $\ell_0$ norm (which seeks the sparsest $\alpha$) is NP-hard and is computationally difficult to solve. In the case in which noisy observations are given, basis pursuit denoising (BPDN) can be used to approximate $\alpha$,

$$\hat{\alpha} = \arg\min_{\alpha'} \|\alpha'\|_1 \quad \text{subject to} \quad \|y - A\alpha'\|_2 \le \epsilon, \tag{5}$$

where we have assumed that the observations are of the following form:

$$y = A\alpha + \eta \tag{6}$$

with $\|\eta\|_2 \le \epsilon$. One condition that is required for both the $\ell_0$ norm based method and the $\ell_1$ norm based method to have the same solution and for Eq. (5) to stably approximate the sparsest near solution of Eq. (6) is known as the restricted isometry property (RIP) [10-12]. A matrix, A, is said to satisfy the RIP of order K with constant $\delta_K \in (0, 1]$ if

$$(1 - \delta_K)\|v\|_2^2 \le \|Av\|_2^2 \le (1 + \delta_K)\|v\|_2^2 \tag{7}$$

for any v such that $\|v\|_0 \le K$. In certain cases, greedy algorithms, such as orthogonal matching pursuit [15], can also be used to recover sparse representations of images.
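As an illustration of the greedy alternative mentioned above, the following is a minimal pure-Python sketch of orthogonal matching pursuit. It is not the solver used in this paper; the dictionary is assumed to be stored as a list of columns, the selected columns are assumed to be linearly independent, and all function names and the toy dictionary are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(G, b):
    """Solve the small square system G c = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (M[i][n] - sum(M[i][j] * c[j] for j in range(i + 1, n))) / M[i][i]
    return c


def omp(columns, y, sparsity):
    """Pick `sparsity` dictionary columns one at a time, refitting the
    coefficients by least squares after every pick."""
    residual, support, coeffs = list(y), [], []
    for _ in range(sparsity):
        # Normalised correlation of every column with the current residual.
        scores = [abs(dot(col, residual)) / math.sqrt(dot(col, col))
                  for col in columns]
        for j in support:
            scores[j] = -1.0              # never pick a column twice
        support.append(max(range(len(columns)), key=scores.__getitem__))
        # Least-squares refit on the chosen columns via the normal equations.
        chosen = [columns[j] for j in support]
        gram = [[dot(u, v) for v in chosen] for u in chosen]
        coeffs = solve(gram, [dot(u, y) for u in chosen])
        fit = [sum(c * col[i] for c, col in zip(coeffs, chosen))
               for i in range(len(y))]
        residual = [yi - fi for yi, fi in zip(y, fit)]
    alpha = [0.0] * len(columns)
    for j, c in zip(support, coeffs):
        alpha[j] = c
    return alpha


# Toy dictionary with four atoms in R^3; y uses only the first and third atom.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
print(omp(atoms, [2.0, 0.0, -1.0], sparsity=2))
```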

B.  Feature  Extraction 

For  a  recognition  method  to  work  well,  one  needs  to 
find  good  features  that  can  separate  the  classes  in 
lower  dimensional  spaces,  often  known  as  the  feature 
space.  The  method  presented  in  the  previous  section 
can  be  easily  extended  to  the  case  in  which  different 
features are used. Let $G \in \mathbb{R}^{M \times N}$ be a matrix with $M < N$ that maps the vectors from the original space to the feature space. Here we have assumed that the transformation is approximately a linear operation. Some examples of G include a random projection (RP) matrix, a downsampling matrix, a PCA dimensionality reduction matrix, or some other orthogonal basis such as a wavelet transform matrix. Equation (6) can be rewritten as

$$\tilde{y} = Gy = GA\alpha + \tilde{\eta} \in \mathbb{R}^{M}, \tag{8}$$

where $\tilde{\eta} = G\eta$. In general, M is chosen to be much smaller than N. This in turn implies that the system of equations in Eq. (8) will be underdetermined. So long as $\alpha$ is sparse enough and GA satisfies certain conditions, one can approximate $\alpha$ by solving the following problem:

$$\hat{\alpha} = \arg\min_{\alpha'} \|\alpha'\|_1 \quad \text{subject to} \quad \|\tilde{y} - GA\alpha'\|_2 \le \epsilon, \tag{9}$$

with $\|\tilde{\eta}\|_2 \le \epsilon$.
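One possible choice of the feature map G is a random projection. The sketch below applies the same matrix to a test vector and to every dictionary column, which is what forming Gy and GA amounts to. The Gaussian entries, the 1/sqrt(M) scaling, the seed handling, and the function names are assumptions made for illustration; the text does not specify how the sensing matrix was generated.

```python
import random


def random_projection_matrix(m, n, seed=0):
    """M x N matrix with i.i.d. Gaussian entries scaled by 1/sqrt(M)
    (an assumed, commonly used construction)."""
    rng = random.Random(seed)
    scale = m ** -0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def project(G, v):
    """Apply G to a length-N vector, giving a length-M feature vector."""
    return [sum(g * x for g, x in zip(row, v)) for row in G]


def project_dictionary(G, columns):
    """Project every dictionary column; the result holds the columns of GA."""
    return [project(G, col) for col in columns]


# Toy sizes: N = 12 pixels reduced to M = 4 features.
G = random_projection_matrix(4, 12, seed=1)
test_vector = [float(i) for i in range(12)]
print(len(project(G, test_vector)))
```

Because the map is linear, the same sparse coefficients that represent y over A also represent Gy over GA, which is why the recognition step can be carried out entirely in the feature space.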

C.  Sparse  Recognition 

Given an observation vector y from one of the L classes in the training set, we compute its coefficients $\hat{\alpha}$ by solving either Eq. (4) or Eq. (5). We perform classification based on the fact that high values of the coefficients $\hat{\alpha}$ will be mainly associated with the columns of A from a single class. We do this by comparing how well the different parts of the estimated coefficients, $\hat{\alpha}$, represent y. The minimum of the representation error or the residual error is then used to identify the correct class. The residual error of class k is calculated by keeping the coefficients associated with that class and setting the remaining coefficients not associated with class k to zero. This can be done by introducing a characteristic function, $\Pi_k : \mathbb{R}^{L \cdot n} \to \mathbb{R}^{L \cdot n}$, which selects the coefficients associated with the kth class as follows:

$$r_k(y) = \|y - A\Pi_k(\hat{\alpha})\|_2. \tag{10}$$

Here, the vector $\Pi_k$ has value one at locations corresponding to class k and zero for other entries. The class, d, that is associated with an observed vector is then declared as the one that produces the smallest approximation error:

$$d = \arg\min_{k} r_k(y). \tag{11}$$
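The residual-based decision rule of Eqs. (10) and (11) can be sketched as follows. This is an illustrative pure-Python version under the assumption that the dictionary is a list of columns with one class label per column; the function names and the toy numbers are hypothetical.

```python
import math


def class_residuals(columns, labels, alpha, y):
    """Residual r_k(y) = ||y - A Pi_k(alpha)||_2 for every class label k."""
    residuals = {}
    for k in sorted(set(labels)):
        # Pi_k keeps the coefficients of class k and zeroes all the others.
        fit = [0.0] * len(y)
        for col, lab, a in zip(columns, labels, alpha):
            if lab == k:
                for i, x in enumerate(col):
                    fit[i] += a * x
        residuals[k] = math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, fit)))
    return residuals


def classify(columns, labels, alpha, y):
    """Return the class with the smallest residual, as in Eq. (11)."""
    residuals = class_residuals(columns, labels, alpha, y)
    return min(residuals, key=residuals.get)


# Toy dictionary: two atoms per class in R^2, with made-up coefficients.
atoms = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 2, 2]
alpha = [0.8, 0.0, 0.1, 0.0]
y = [0.8, 0.1]
print(class_residuals(atoms, labels, alpha, y), classify(atoms, labels, alpha, y))
```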

D.  Image  Quality  Measure 

From the previous discussion, one would expect the solution $\alpha$ to be sparse and that it should belong to only one class. For instance, the test image from Target 1 should only belong to the corresponding Target 1 class rather than a combination of different classes. To measure the quality of the coefficient vector, $\alpha$, the notion of the sparsity concentration index (SCI) [6] has been introduced. The SCI of a coefficient vector, $\alpha \in \mathbb{R}^{(L \cdot n)}$, is defined as

$$\mathrm{SCI}(\alpha) = \frac{\dfrac{L \cdot \max_i \|\Pi_i(\alpha)\|_1}{\|\alpha\|_1} - 1}{L - 1}. \tag{12}$$

SCI takes values between 0 and 1. SCI values close to 1 correspond to the case in which the test image can be approximately represented by using only images from a single class. In this case the test vector has enough discriminating features of its class and hence has high quality. If SCI = 0, then the coefficients are spread evenly over all classes. So the test vector is not similar to any of the classes and hence is of poor quality.
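The two extreme cases described above can be checked numerically with a direct transcription of Eq. (12). The sketch assumes one class label per coefficient; the function name and the toy vectors are hypothetical.

```python
def sci(alpha, labels):
    """Sparsity concentration index of Eq. (12); labels[j] is the class
    of coefficient alpha[j]."""
    classes = sorted(set(labels))
    num_classes = len(classes)
    total = sum(abs(a) for a in alpha)
    if num_classes < 2 or total == 0:
        raise ValueError("SCI needs at least two classes and a nonzero vector")
    # Largest l1 mass captured by the coefficients of any single class.
    largest = max(sum(abs(a) for a, lab in zip(alpha, labels) if lab == k)
                  for k in classes)
    return (num_classes * largest / total - 1.0) / (num_classes - 1.0)


labels = [1, 1, 2, 2, 3, 3]
concentrated = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]   # all mass in class 1
spread = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]         # equal mass in every class
print(sci(concentrated, labels), sci(spread, labels))
```

A test vector whose SCI falls below a chosen threshold could then be flagged as unreliable before the residual-based decision is trusted.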

3.  Block-Sparsity-Based  Recognition 

The following problem is related to Eq. (5) and is often known as the least absolute shrinkage and selection operator (LASSO) [16]:

$$\min_{\alpha'} \|y - A\alpha'\|_2^2 \quad \text{subject to} \quad \|\alpha'\|_1 \le \tau, \tag{13}$$

where $\tau > 0$. The constrained optimization problems [Eqs. (5) and (13)] are closely related to the following unconstrained optimization problem:

$$\min_{\alpha'} \frac{1}{2}\|y - A\alpha'\|_2^2 + \lambda\|\alpha'\|_1, \tag{14}$$

where $\lambda$ is a nonnegative parameter. The Lagrange multiplier, $\lambda$, is related to the LASSO parameter, $\tau$, of the constraint in Eq. (13) and to the reciprocal of the parameter, $\epsilon$, of the constraint in Eq. (5). Hence, for appropriate selections of $\epsilon$, $\lambda$, and $\tau$, the solutions of Eqs. (5), (13), and (14) coincide [17]. That is, these formulations can all be used to identify sparse approximate solutions to the underdetermined system (2).

It has been observed that the LASSO method tends to select a single sample from a group of correlated training samples [18]. However, in our application, we have L target classes. Hence, our dictionary, A, consists of blocks of training samples corresponding to L different targets. Thus, the resulting sparse coefficients, $\alpha$, occur in a block. This means that we can use a regularization method that selects an entire block of correlated training samples belonging to the same class. This can be achieved by adapting the following relaxation:

$$\hat{\alpha} = \arg\min_{\alpha'} \|\alpha_1\|_2 + \|\alpha_2\|_2 + \cdots + \|\alpha_L\|_2 \quad \text{subject to} \quad \|y - A\alpha'\|_2 \le \epsilon, \tag{15}$$

where $\alpha_i = (\alpha'_{(i-1)n+1}, \alpha'_{(i-1)n+2}, \ldots, \alpha'_{in})$ for $i = 1, 2, \ldots, L$. Note that this method requires the labels of each group in A. This presents no obstacles, because the label of each training sample is known a priori. It is shown in [19] that the solution of Eq. (15) satisfies

$$\|\alpha - \hat{\alpha}\|_2 \le C_1 K^{-1/2} \|\alpha - \alpha^{K}\|_{2,J} + C_2 \epsilon,$$

provided that $\|y - A\alpha\|_2 \le \epsilon$ and $\delta_{2K|J} < \sqrt{2} - 1$, where $C_1$ and $C_2$ are some constants, $\alpha^{K}$ denotes the best block K sparse approximation to $\alpha$, $\|\alpha\|_{2,J} = \sum_{i=1}^{L} \|\alpha_i\|_2$, and $J = \{J_i\}_{i=1}^{L}$ is the partition of the index set $\{1, 2, \ldots, n \cdot L\}$, that is, $\bigcup_{i=1}^{L} J_i = \{1, 2, \ldots, n \cdot L\}$ and $\sum_{i=1}^{L} |J_i| = n \cdot L$. Here the block-restricted isometry constant $\delta_{K|J}$ is defined as the smallest constant such that A satisfies

$$(1 - \delta_{K|J})\|v\|_2^2 \le \|Av\|_2^2 \le (1 + \delta_{K|J})\|v\|_2^2 \tag{16}$$

for any v that is block K sparse over J. It follows that $\delta_{K|J} \le \delta_K$, where $\delta_K$ is the conventional restricted isometry constant [e.g., Eq. (7)], corresponding to non-BS vectors [19]. This, in turn, implies the existence of improved performance guarantees compared to BPDN [19,20]. For this reason, in our ATR algorithm, we use Eq. (15) to harness the underlying BS structure of $\alpha$. In recent years, an enormous amount of research has been done regarding related regularization methods. It is also possible to use grouped LASSO [21,22] and elastic net [18] for regularization in our recognition method. Note that the reconstruction of BS signals from compressive measurements has been studied extensively in [20,23]. Our ATR algorithm based on SR is summarized in Fig. 2.
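To see why the objective of Eq. (15) favors coefficients that cluster inside one block, it helps to compare it with the plain $\ell_1$ norm on two toy vectors. The sketch below only evaluates the two norms; it does not solve the optimization problem, and the function names and example vectors are hypothetical.

```python
import math


def l1_norm(alpha):
    return sum(abs(a) for a in alpha)


def block_norm(alpha, block_size):
    """Sum of the Euclidean norms of consecutive blocks of length block_size,
    i.e. the objective of Eq. (15) for equally sized blocks."""
    if len(alpha) % block_size:
        raise ValueError("vector length must be a multiple of block_size")
    return sum(math.sqrt(sum(a * a for a in alpha[s:s + block_size]))
               for s in range(0, len(alpha), block_size))


clustered = [3.0, 4.0, 0.0, 0.0]   # both nonzeros fall in the first block
scattered = [3.0, 0.0, 4.0, 0.0]   # one nonzero in each block

print(l1_norm(clustered), l1_norm(scattered))
print(block_norm(clustered, 2), block_norm(scattered, 2))
```

Both vectors have the same $\ell_1$ norm, but the clustered one has the smaller block norm, so minimizing the block norm prefers solutions supported on few blocks.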

4.  Experimental  Results 

In  this  section,  we  present  some  simulation  results  of 
different  ATR  methods  promoting  sparsity  on  the 
Comanche  (Boeing-Sikorsky,  USA)  FLIR  data  set 
consisting  of  different  military  targets  at  different  or¬ 
ientations.  The  images  are  of  size  40  x  75  pixels.  In 
all  of  our  experiments,  the  dimension  of  each  target 
image  (chip)  was  reduced  from  40  x  75  to  16  x  16  un¬ 
less  otherwise  stated.  A  number  of  approaches  have 
been  suggested  for  solving  BS-promoting  optimiza¬ 
tion  problems  (15).  In  our  approach,  we  employed 
a  highly  efficient  algorithm  that  is  suitable  for  large 
scale  applications,  known  as  the  spectral  projected 
gradient  (SPGL1)  algorithm  [17].  The  threshold  va¬ 
lue  for  SCI  was  set  equal  to  0.15.  The  performance  of 
our  algorithm  is  compared  with  that  of  several  differ¬ 
ent  methods  reported  in  [2,3,5].  Our  algorithm  is  also 
tested  using  several  features,  namely  PCA  features, 
RP  features,  two-dimensional  Haar  wavelet  features, 
and  downsampled  images. 

A.  Data  Set 

In our data set, there are 10 different vehicle targets. We will denote these targets as TG1, TG2, ..., TG10. For each target, there are 72 orientations, corresponding to the aspect angles of 0°, 5°, ..., 355° in azimuth. The range to all the targets is given so that all the target chips are analyzed at 2 km. The data consist of a training set and a test set. We will refer to the training set as the SIG set and the test set as the ROI set. The SIG data set has about 13,816 target chips, while there are 3353 images in the ROI data set. The SIG data set consists of images that were collected under very favorable conditions. The SIG data set contains 874 to 1468 images per target class spanned over 72 different aspects. In Fig. 3, we display the side view of all the 10 targets present in the SIG set.

Given a matrix of training samples $A \in \mathbb{R}^{N \times (L \cdot n)}$ for L classes and a test sample $y \in \mathbb{R}^{N}$:

1. Solve the optimization problem (15) to obtain $\hat{\alpha}$.

2. Compute the reconstruction error using (10).

3. Identify y using (11).

Fig. 2. SR-based ATR algorithm.

TG1  TG2  TG3  TG4  TG5

TG6  TG7  TG8  TG9  TG10

Fig. 3. Side view of all 10 targets present in the SIG data set.

The ROI set consists of only five targets, namely TG1, TG2, TG3, TG4, and TG7. The target images for the ROI set were taken under less favorable conditions, such as different weather conditions, different backgrounds, and in and around clutter; hence these data are very challenging. There are 577 to 798 images for each of these five target classes. Some of the target chips from the ROI data set are shown in Fig. 4. All the images in the SIG and ROI sets were normalized to a fixed range with the target put approximately in the center. The orientation in the ROI set was given very coarsely, every 45°.

B.  Results  on  SIG  Data  Set 

Fig. 4. Some sample target chips from the ROI data set.

In the first set of experiments, the training and test images were chosen from the SIG data set. For training, we randomly chose 11 target chips for each target per aspect angle, called TRAIN-SIG. Because we have a total of 72 aspects (i.e., 0°, 5°, ..., 355°) for each target, we used a total of 11 × 72 = 792 target chips per class. Hence, the resulting dictionary, A, is of size 256 × 7920. Another set of 1000 targets, disjoint from the training data, called TEST-SIG, was used for testing. We solve the following problem favoring BS based on target class per aspect:

$$\hat{\alpha} = \arg\min_{\alpha'} \|\alpha_1\|_2 + \|\alpha_2\|_2 + \cdots + \|\alpha_{720}\|_2 \quad \text{subject to} \quad \|y - A\alpha'\|_2 \le \epsilon,$$

where $\alpha_i = (\alpha'_{(i-1)n+1}, \alpha'_{(i-1)n+2}, \ldots, \alpha'_{in})$ for $i = 1, 2, \ldots, 720$ and $n = 11$. Once the BS vector, $\hat{\alpha}$, is found, we compute the reconstruction error using Eq. (10) and identify the novel target chip, y, using Eq. (11). We applied this BS-based algorithm to various features on the TRAIN-SIG data set. Examples of different features extracted for this experiment are shown in Fig. 5.
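The dictionary dimensions and the number of blocks used in this experiment follow from the counts given above, as the short sketch below works out. The constant names and the helper `block_of` are hypothetical, and the column ordering (by target, then aspect, then chip) is an assumption made only so that a column index can be mapped back to its block.

```python
CHIP_SIDE = 16          # chips are reduced to 16 x 16 pixels
CHIPS_PER_ASPECT = 11   # training chips per target and aspect angle
ASPECTS = 72            # aspect angles 0, 5, ..., 355 degrees
CLASSES = 10            # targets TG1 ... TG10

rows = CHIP_SIDE * CHIP_SIDE                      # feature dimension
chips_per_class = CHIPS_PER_ASPECT * ASPECTS      # training chips per target
columns = chips_per_class * CLASSES               # dictionary atoms
blocks = ASPECTS * CLASSES                        # one block per (target, aspect)


def block_of(column):
    """Map a 0-based column index to (target index, aspect angle in degrees),
    assuming columns are ordered by target, then aspect, then chip."""
    block = column // CHIPS_PER_ASPECT
    return block // ASPECTS, 5 * (block % ASPECTS)


print(rows, chips_per_class, columns, blocks)
print(block_of(0), block_of(11), block_of(columns - 1))
```

Because each block pairs a target with an aspect angle, the block that carries most of the coefficient energy indicates a likely orientation as well as a class.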

The  probabilities  of  correct  classification  for 
these  experiments  are  98.48%,  99.18%,  99.96%, 
and  99.95%  for  the  downsampled,  RP,  PCA,  and  Haar 
wavelet  features,  respectively.  All  the  features  per¬ 
formed  approximately  the  same  for  these  experi¬ 
ments.  The  confusion  matrices  [24]  corresponding 
to  these  experiments  are  shown  in  Figs.  6(a)-6(d). 

C.  Results  on  ROI  Data  Set 

In  the  second  set  of  experiments,  we  randomly  se¬ 
lected  11  targets  per  aspect  angle  from  the  SIG  data 
set for training. The resulting dictionary, A, is of size
256 × 7920. We randomly selected 1000 images from
the  ROI  set  for  testing,  called  the  TEST-ROI  set. 
Again,  we  extracted  various  features  and  applied  our 
BS-based  algorithm  to  these  features  as  was  done  for 
the  TRAIN-SIG  data  set.  The  probabilities  of  correct 
classification  for  these  experiments  are  75.10,  76.30, 
78.89,  and  76.45%  for  the  downsampled,  RP,  PCA, 
and  Haar  wavelet  features,  respectively.  The  PCA 
features  gave  the  best  performance.  The  confusion 
matrices  corresponding  to  these  experiments  are 
shown in Figs. 7(a)-7(d). In these experiments,
the  TEST-ROI  set  contained  only  five  targets,  but 
all  of  the  outputs  were  active.  Note  that  we  have 
included  five  rows  with  zeros  for  clarity  due  to  the 
fact  that  five  other  targets  are  not  present  in  this 
data  set. 

The best recognition results on the TEST-SIG and
TEST-ROI data sets were obtained by using the PCA
features. Performance of our algorithm using various
features on TEST-SIG and TEST-ROI is compared in
Fig. 8. Also, we report the performance of different
techniques [2,3,5] on these data sets in Table 1. As
can be seen from the table, our method achieves
recognition rates of 99.96% and 78.89% on TEST-SIG


Fig. 5. Examples of different features used in this paper:
(a) original target chip, (b) Haar wavelet features, (c) downsampled
image, (d) PCA features, (e) random projection.

Fig. 6. (Color online) Confusion matrices corresponding to the SIG data set using different features: (a) downsampled (panel title: Downsample+BS on TEST-SIG), (b) random
projection (RP+BS on TEST-SIG), (c) PCA (PCA+BS on TEST-SIG), (d) Haar wavelet (Wavelet+BS on TEST-SIG). Horizontal axis: Guessed Class.


and TEST-ROI, respectively, and it outperforms
other methods such as CNN, MNN, PCA, LVQ,
LDA, H-M, and G-H [2,3,5]. Also, note that our
method is more general than the competing methods
presented in [2,3]. In their methods, to deal with the
background artifacts, they use several rectangular
windows of different sizes based on the ground truth
silhouette computer-aided design models. As a
result, their performance significantly depends on
the choice of windows. In contrast, the method
presented here does not require any windowing or prior
knowledge about the size of the targets.

D.  Target  Pose  Estimation 

Because the dictionary, A, contains target chips at
different known orientations, we can reliably
estimate the pose of a target from the test set. To
illustrate this, consider the test target chip shown in Fig. 9.
This chip belongs to TG1, and its orientation is 5°.
The values of the sparse coefficients obtained by
solving a BS-promoting problem are shown at the
bottom of Fig. 9. Using the correct labels of columns in A
and the sparse coefficients, one can identify the
corresponding aspect angle of this target.
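
One possible way to read the aspect angle off the sparse coefficients is sketched below. It assumes that every column of A carries a (target, aspect angle) label and that the block with the largest coefficient energy determines the pose; this decision rule, the function name, and the toy labels are illustrative assumptions rather than the procedure used in the experiments.

```python
from collections import defaultdict

def estimate_pose(alpha, column_labels):
    """Return the (target, angle) label whose block holds the most energy."""
    energy = defaultdict(float)
    for coef, label in zip(alpha, column_labels):
        energy[label] += coef * coef
    return max(energy, key=energy.get)

# Hypothetical labels: two columns per (target, aspect angle) block.
column_labels = [("TG1", 0), ("TG1", 0), ("TG1", 5), ("TG1", 5),
                 ("TG2", 5), ("TG2", 5)]
alpha = [0.02, 0.0, 0.6, 0.4, 0.05, 0.0]
target, angle = estimate_pose(alpha, column_labels)
print(target, angle)
```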

Note that such pose information can be utilized to
exclude the background in each target chip as was
done in [5]. Furthermore, if we know the possible
orientation of a target, we can validate the target type
by using some features unique to the target in that
orientation (see [3,5] for details).

E.  Recognition  Rate  versus  Feature  Dimension 

In this section, we show how the performance of our
algorithm changes as we change the feature
dimension. For this experiment we again randomly selected
11 targets per aspect angle from the SIG data set for
training. Another set of 1000 targets, disjoint from
the training data, was used for testing. PCA
features were used for this experiment. Figure 10 shows
the recognition rates for this experiment
corresponding to various feature dimensions. As can be seen
from this figure, the recognition rate increases as
we increase the feature dimension. Above the
feature dimension of 256, the recognition rate stays

Fig. 7. (Color online) Confusion matrices corresponding to the ROI data set using different features: (a) downsampled (panel title: Downsample+BS on TEST-ROI), (b) random
projection (RP+BS on TEST-ROI), (c) PCA (PCA+BS on TEST-ROI), (d) Haar wavelet (Wavelet+BS on TEST-ROI). Horizontal axis: Guessed Class.


Fig. 8. (Color online) Recognition results on the TEST-SIG and
TEST-ROI sets using different features. Plot title: Performance comparison of BS with different features; horizontal axis: Features.


approximately the same. This is no surprise because
it has been observed by many researchers that, in
practice, features (e.g., number of compressive
measurements) of the order of 3 to 5 times the number of
sparse coefficients suffice for a good recovery [25,26].
From our assumption that any test image that
belongs to class k will approximately lie in the linear
span of the training samples from the same class,
and because each class per aspect angle contains
11 training images, our method requires the feature
dimension to be more than 5 × 11 = 55 for good
recovery of sparse coefficients. Hence, increasing the
feature dimension more than is required by the $\ell_1$
minimization will not necessarily improve the
quality of recovered sparse coefficients. This in turn
implies that after a certain feature dimension, the
recognition rate will approximately stay the same
[27]. Also, from the previous experiments, we see
that the BS-based recognition algorithm gives
approximately the same performance when different
features are used, provided that the dimension of

Table 1. Recognition Rates (in %) for Different Methods

Methods    BS     CNN    LVQ    MNN    PCA    LDA    H-M    G-H
TEST-SIG   99.96  95.16  99.72  95.49  95.44  86.92  93.73  80.24
TEST-ROI   78.89  59.25  75.12  75.58  52.17  50.32  62.86  50.09

Fig. 9. (Color online) Target orientation detection. Dictionary matrix A contains training images with known orientation. This can be
used to identify the aspect angle of a test target. Axis label: Orientation.


the  features  is  kept  high  enough.  This  shows  that  the 
choice  of  features  is  not  critical  but  the  dimension  of 
features  is. 


Fig. 10. (Color online) Recognition rate versus feature
dimension.


5.  Discussion  and  Conclusion 

We have developed a framework for ATR using the
theory of SR and CS. This entails solving a
BS-promoting optimization problem on various features.
Various experiments on the Comanche
(Boeing-Sikorsky, USA) FLIR data set have shown promising
results.

Several future directions of inquiry are possible
considering our new approach to ATR. For instance,
instead of using the $\ell_1$ minimization, one can
consider greedy pursuits such as orthogonal matching
pursuit and compressive sampling matching pursuit
[15,28,29]. Greedy pursuits are known to converge
much faster than optimization-based methods and
have the same theoretical guarantees as some of the
optimization-based methods. Even though, in this
paper, we took a reconstructive approach to
dictionary learning for ATR, it is possible to learn
discriminative dictionaries for the task of target recognition
[30,31]. Note that the sparsity-motivated methods
for ATR presented here for FLIR images can be
easily extended to the other ATR problems based
on ladar, underwater optical imagery [32], or
synthetic aperture radar imagery.

1 Introduction

As sensor technology, network communication, computing
power, and digital storage capacity have all dramatically
improved, still and video imageries have become the most
common and versatile forms of media for capturing,
analyzing, and disseminating a variety of information. Visible
cameras are the prevailing imaging sensors because they
are relatively cheap, easy to use, and capable of producing
high-quality imagery under favorable conditions. However,
visible cameras can be severely affected by common
environmental factors such as darkness, shadows, fog, clouds,
rain, snow, and smoke. Infrared (IR) imaging systems may
overcome or alleviate some of these problems, but they are
subject to a number of limitations of their own. IR-specific
difficulties include a much lower sensor resolution; total
loss of nonthermal but important visual features (such as
color and text); blockage by visually transparent thermal
signal shields (such as car windshields and glass doors);
and very low thermal contrast between targets and
background under certain combinations of ambient and target
temperatures. Due to these highly complementary strengths
and limitations of visible and IR cameras, more advanced
target detection and tracking systems may want to acquire
and process both visible and IR imageries concurrently
and jointly for critical surveillance and force protection
applications.

To study the usefulness of fusing visible and IR imageries
for detecting and tracking moving targets, we have relied on
a large collection of concurrent color visible and long-wave
IR (LWIR) video sequences that are officially referred to as
the Second Dataset of the Force Protection Surveillance
System (FPSS).1 These FPSS video sequences were
collected using the Sentry Personnel Observation Device
(SPOD) that includes a LWIR microbolometer and a color
visible camera. The LWIR images were acquired with a focal
plane array (FPA) of 320 × 240 pixels in resolution, while
the color visible images were captured at the resolution of
460 TV lines. Both the original color visible and LWIR
images were cropped and scaled to a common image size
of 640 × 480 pixels, in order to attain a coarse level of
coregistration between the corresponding color-LWIR images
captured at any given time.

Image fusion can be performed at several different levels.2
At the lowest levels, the raw image data can be fused. This
can either be performed on the original signal or, more likely,
after the image has been preprocessed and the resulting pixel
values are used. Pixel-level fusion is very common due to
its simplicity and universality, and it is the focus of this
work as well. At higher levels, feature-based detection uses
structural image characteristics, such as edges and corners,
to enhance the image. For example, one could extract the
edge information from a pair of images using a Sobel filter
and fuse the images based on the edge information.
However, this approach is much more application-specific,
often requiring an understanding of the image itself, either
through direct human intervention or automatic object
classification algorithms. Therefore, this approach requires much
more complex computation, complicated training methods,
and non-real-time intervention. One example of a higher
level fusion system uses Bayesian analysis to sum the
probabilities of detected human silhouettes falling within each
pair of visible and infrared images. Oftentimes, detections
are based on whether the probability exceeds a predefined
threshold.3 For a stationary camera installed in a specific
setting, training such a system may be feasible because its
background does not vary significantly. At the highest level of
image fusion, symbolic fusion methods are often heavily
rule-based and rely on a lot of prior or external knowledge
to perform the image fusion. Nonetheless, symbolic image
fusion methods can carry tradeoffs similar to those of the fusion
methods at the feature level.

There are many ways to measure performance of image
fusion algorithms, including subjective analysis, complex
similarity metrics, signal-to-noise ratio (SNR), and tracking
performance. Motwani et al. suggested parameters for
subjective analysis, but they concluded that subjective measures
were not particularly helpful for tracking systems, except in
the case of incorporating human feedback into the detection
loop.4 Cvejic et al. discussed a number of objective similarity
metrics, including the Piella metric, Petrovic metric, and
Bristol metric.5 The Piella metric measures structured
similarity (which is based on luminance, contrast, and structure
information) over local window regions and then averages
these similarity measures over all windows. Weighting is
given to the relative importance of each input image toward
the fused image, window by window. The Petrovic metric
specifically evaluates edge structure (using a Sobel edge
operator) by determining the strength of edge information
retained from each of the original images in the fused image.
The Bristol metric, in contrast to the Piella metric, uses a
slightly different weighting scheme based on the ratio of
covariances between the original and fused images. Cvejic
et al. compared the tracking performance of a particle filter
based on these objective metrics and found that the tracking
performance was actually worsened by the fusion of images.
Mihaylova et al., of the same research group, later adopted
a performance metric of normalized overlapping ground
truth and tracking system bounding boxes in their work.6
Their results showed that IR images alone performed just
as well or better than most fusion algorithms (including
contrast pyramid, dual-tree complex wavelet transform, and
discrete wavelet transform) in tracking, while visible
spectrum images lagged behind under harsher conditions like
occlusions.

There are many possible methods of tracking a moving
target, including background subtraction, optical flow,
moving energy, and temporal differencing. Because the FPSS
dataset was collected with a stationary SPOD with minimal
background interference, we decided to use an existing FPSS
tracker,7 which is based on a background subtraction method,
to examine the tracking performance of various image fusion
methods. Instead of the FPSS tracker, one of many other
moving target tracking algorithms may be used for a similar
study as well. For instance, Trucco and Plakas described a
wide range of alternative tracking algorithms in their paper.8

In the next section, we provide brief discussions on the 13
image fusion methods of interest. These fusion methods fall
into two broad categories: simple combination and pyramid
structure. A brief description of the FPSS tracker is provided
in Sec. 3, while the experimental results on the tracking
performance of various image fusion methods are presented in
Sec. 4. Finally, some concluding thoughts are given in
Sec. 5.

2  Fusion  Methods 

In this paper, we focus on 13 pixel-level image fusion
methods, ranging from the simplest pixel-averaging method to
the very complicated dual-tree complex wavelet transform
method. There are other interesting but less popular image
fusion algorithms, including one that relies on factorizing
an image V into two nonnegative matrix components, W
and H, with W representing a basis optimized for
representing V.9 Another approach to image fusion is to use training
sets and supervised classifiers, as explored by Chan.10 In
this work, however, we assume no prior training data are
available.

To evaluate the image fusion algorithms examined here,
we used all FPSS coarsely registered color visible and LWIR
images as input data, a pair of which is shown in Fig. 1. To
allow fusion with LWIR images, the color visible (RGB)
images were converted to grayscale images using a simple
weighting of 0.2989R + 0.5870G + 0.1140B, which yielded
the intensity value but removed the hue and saturation
information.11 For many automatic target detection and tracking
algorithms, it is, indeed, more efficient to process grayscale
images internally, while providing color outputs for human
consumption only.
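
The grayscale conversion can be written as a short sketch. It assumes that an image is stored as rows of (R, G, B) tuples; the function name and the sample pixels are illustrative only.

```python
def rgb_to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of intensity values."""
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for r, g, b in row]
            for row in rgb_image]

# Hypothetical 2 x 2 image: white, red, green, and blue pixels.
image = [[(255, 255, 255), (255, 0, 0)],
         [(0, 255, 0), (0, 0, 255)]]
gray = rgb_to_gray(image)
print(gray)
```

Because the three weights sum to 0.9999, a saturated white pixel maps to slightly less than 255, and green contributes the most to the resulting intensity.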

2.1  Simple  Combinations 

The most intuitive pixel-level fusion methods examined here
are simple averaging, intelligent weighting, and selecting
maximum or minimum pixel values between the visible and
LWIR images. All these methods involve only simple pixel
operations, which require traversing the two input images to
be fused pixel-by-pixel, leading to a simple $O(m \times n)$
operation for an image of size $m \times n$. Pixels $(I_1)_{ij}$ and $(I_2)_{ij}$ in
images $I_1$ and $I_2$ need only be compared against each other
once.

In the first fusion method, a fused image $I_f$ was generated
through simple averaging by calculating $(I_f)_{ij} = [(I_1)_{ij} + (I_2)_{ij}]/2$, and the resulting $I_f$ is shown in Fig. 2(a).

Fig.  1  Example  of  a  color  visible  (a)  and  an  LWIR  (b)  image  in  the  FPSS  dataset. 

Fig. 2 Fused image through simple average (a) and PCA-weighted average (b).


Because the visible and LWIR images have differing
resolutions and salient features, this method tends to muddle the
details.

We boosted the influence of the better image using the
principal component analysis (PCA) derived from the
covariance matrix between the two input images. A simple way to
do this is to consider each image as a single vector, $I_1$ and $I_2$,
creating a 2 × 2 covariance matrix when we compute the
covariance of $[I_1 \; I_2]$. A resulting eigenvector provides
the weights for fusing the pixels: $(I_f)_{ij} = (v_k)_1 (I_1)_{ij} + (v_k)_2 (I_2)_{ij}$, where $v_k$ represents the normalized eigenvector
associated with $\lambda_k$, the larger one of the two eigenvalues.
Generally, the PCA-weighted averaging method strongly
favors the image with the highest variance, which may or
may not contain more informative and useful details. In fact,
this selection criterion can be a disadvantageous one when
dealing with noisy images. As shown in Fig. 2(b), the fused
image produced by this method closely matches the original
visible spectrum image because the visible image has more
details and a higher variance.
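
A minimal sketch of the PCA-weighted average follows. It assumes that "normalized" means the two eigenvector components are scaled to sum to one, and it uses the closed-form dominant eigenvector of a symmetric 2 × 2 matrix; both choices, together with the function names and the toy images, are assumptions made for illustration.

```python
import math

def pca_weights(img1, img2):
    """Weights (w1, w2) from the dominant eigenvector of the 2x2 covariance."""
    x = [p for row in img1 for p in row]
    y = [p for row in img2 for p in row]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((p - mx) ** 2 for p in x) / (n - 1)                   # var(I1)
    c = sum((q - my) ** 2 for q in y) / (n - 1)                   # var(I2)
    b = sum((p - mx) * (q - my) for p, q in zip(x, y)) / (n - 1)  # cov(I1, I2)
    if b < 0:
        raise ValueError("sketch assumes non-negatively correlated images")
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)     # larger eigenvalue
    if b == 0:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        v = (b, lam - a)                                          # eigenvector for lam
    total = v[0] + v[1]
    return v[0] / total, v[1] / total

def fuse_pca(img1, img2):
    w1, w2 = pca_weights(img1, img2)
    return [[w1 * p + w2 * q for p, q in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

# Toy images (hypothetical): the first has twice the spread of the second.
img1 = [[0, 2], [4, 6]]
img2 = [[0, 1], [2, 3]]
w1, w2 = pca_weights(img1, img2)
fused = fuse_pca(img1, img2)
print(w1, w2, fused)
```

In this toy case the higher-variance image receives a weight of about 2/3, which mirrors the tendency noted above for the method to favor the image with the larger variance.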

Choosing the maximum pixel value, $(I_f)_{ij} = \max[(I_1)_{ij}, (I_2)_{ij}]$, from a pair of LWIR and visible images, as shown in
Fig. 3(a), may be appropriate to find some hidden targets. A
man may be occluded in the visible spectrum, for example,
but he can still be located in the LWIR image. For a
background subtraction method, it may be desirable to boost the
relative intensity of targets through this fusion method, if
these targets tend to be brighter than their immediate
background.

Choosing the minimum pixel value, $(I_f)_{ij} = \min[(I_1)_{ij}, (I_2)_{ij}]$, may not be very useful in general because it tends
to deemphasize the strong foreground objects, as evident
from Fig. 3(b). On some rare occasions, this method may be
helpful in extracting weak targets (with both weak but
detectable visible and LWIR signatures) from busy backgrounds
by deemphasizing stronger and brighter neighboring
background pixels.
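
The averaging, maximum, and minimum rules differ only in the function applied to each pair of corresponding pixels, which the following sketch makes explicit. The helper name and the 2 × 2 sample images are hypothetical.

```python
def fuse_pixelwise(img1, img2, rule):
    """Apply a two-argument rule to every pair of corresponding pixels."""
    return [[rule(p, q) for p, q in zip(r1, r2)]
            for r1, r2 in zip(img1, img2)]

# Hypothetical registered grayscale images of equal size.
visible = [[10, 200], [30, 40]]
lwir = [[50, 100], [30, 90]]

fused_avg = fuse_pixelwise(visible, lwir, lambda p, q: (p + q) / 2)
fused_max = fuse_pixelwise(visible, lwir, max)
fused_min = fuse_pixelwise(visible, lwir, min)
print(fused_avg, fused_max, fused_min)
```

Each rule visits every pixel pair exactly once, so the cost grows linearly with the number of pixels.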

2.2  Pyramid  Structures 

Pyramid decompositions were introduced by Burt and
Adelson in 1983 as a compact encoding scheme.12 The
original idea is that a Gaussian kernel (low-pass filter) is applied
to the top-level image of a pyramid, $I_1 * G_1$, representing the
convolution of the image $I_1$ with a Gaussian blurring matrix $G_1$.


Fig.  3  Fusion  by  selecting  maximum  (a)  and  minimum  (b)  pixel  intensities. 
This image is then down-sampled to form the next level
of these pyramids. The difference between the low-pass
version and its previous-level image represents the high
frequency or detail information of the previous-level image. At
each step down the pyramid, we continue to filter and
down-sample in the same manner. A Laplacian pyramid is formed
by computing the difference between each level of the
pyramid, iteratively separating an image into low and high
frequency components, except that the lowest level contains
the remaining low-frequency information.

Since each level is a down-sampled version of the
previous level, we need to up-sample and interpolate the
decimated version in order to compute the difference between
the two adjacent levels. For example, the Laplacian image
at level k of $I_m$, denoted as $(L_m)_k$, can be computed as
$(L_m)_k = (I_m)_k - f_{k+1}[(I_m)_{k+1}]$, where $f_{k+1}[\cdot]$ denotes the
function consisting of up-sampling and an interpolation filter
with similar blurring response as $G_k$, while k denotes the
level of decomposition. As we proceed down the pyramid,
$(I_m)_k$ denotes the blurred and decimated version of $(I_m)_{k-1}$.
By decomposing each set of the original LWIR and visible
images, we form compact representations separated into
detail and approximation information. Hence, we can then
weight the coefficients in each pyramid. To reconstruct the
fused image, we then reverse the decomposition process,
starting with a synthesis image at level k + 1, denoted as
$(S_m)_{k+1}$, expanding it, and adding it to $(L_m)_k$ to get $(S_m)_k$.
The initial synthesis image is the background coefficients
found at the bottom of the Laplacian pyramid. If we select
the maximum coefficients between the two pyramids by
taking $\max\{[(L_1)_k]_{ij}, [(L_2)_k]_{ij}\}$ for each level k, and all ij
coefficients during this reconstruction process, then a
Laplacian fused image is generated [see Fig. 4(a)].
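
The decomposition, maximum-coefficient selection, and synthesis steps can be sketched as follows. The 5-tap binomial kernel, the edge replication at the borders, the pixel-replication up-sampling, and all function names are assumptions made for this illustration and are not taken from the implementation evaluated in this paper.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # 5-tap low-pass (assumed)

def blur_rows(img):
    out = []
    for row in img:
        n = len(row)
        # Borders are handled by replicating the edge pixel.
        out.append([sum(w * row[min(max(j + d - 2, 0), n - 1)]
                        for d, w in enumerate(KERNEL)) for j in range(n)])
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def blur(img):
    return transpose(blur_rows(transpose(blur_rows(img))))

def shrink(img):
    """Low-pass filter, then keep every second row and column."""
    return [row[::2] for row in blur(img)[::2]]

def expand(img, height, width):
    """Up-sample by pixel replication, then interpolate with the same blur."""
    up = [[img[i // 2][j // 2] for j in range(width)] for i in range(height)]
    return blur(up)

def laplacian_pyramid(img, levels):
    pyramid, current = [], img
    for _ in range(levels):
        smaller = shrink(current)
        up = expand(smaller, len(current), len(current[0]))
        pyramid.append([[c - u for c, u in zip(rc, ru)]
                        for rc, ru in zip(current, up)])
        current = smaller
    pyramid.append(current)  # remaining low-frequency image
    return pyramid

def fuse_max(pyr1, pyr2):
    return [[[max(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(l1, l2)]
            for l1, l2 in zip(pyr1, pyr2)]

def reconstruct(pyramid):
    current = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        up = expand(current, len(lap), len(lap[0]))
        current = [[l + u for l, u in zip(rl, ru)] for rl, ru in zip(lap, up)]
    return current

# Hypothetical 8 x 8 test images.
img_a = [[float((3 * i + 5 * j) % 11) for j in range(8)] for i in range(8)]
img_b = [[float((7 * i + 2 * j) % 13) for j in range(8)] for i in range(8)]
pyr_a = laplacian_pyramid(img_a, 2)
pyr_b = laplacian_pyramid(img_b, 2)
restored = reconstruct(pyr_a)
fused = reconstruct(fuse_max(pyr_a, pyr_b))
```

Because synthesis adds back exactly the expanded image that was subtracted during decomposition, an unmodified pyramid reconstructs its input up to rounding error, whichever interpolation filter is chosen.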

A filter-subtract-decimate (FSD) pyramid is similar to
the Laplacian pyramid, but the levels are subtracted prior to
decimation. This makes the method simpler and reduces
delay, thereby allowing easier real-time implementation.
Slight frequency distortions are introduced; thus, a correction
factor is required for perfect reconstruction. This term can
be dropped in practice, though variations can make minor
adjustments in the synthesis phase to account for this.
Figure 4(b) shows the result of image fusion based on the
original FSD technique proposed by Anderson.13 Both
images in Fig. 4 may look similar, except for a slight shading
difference, but their differences in tracking performance
could be larger than that.

The ratio-of-low-pass (ROLP) pyramid and the contrast
pyramid use the ratio of levels of the Gaussian pyramid to
compute the coefficients at the next level, instead of their
differences.14,15 Otherwise, the decomposition process
resembles that of the Laplacian pyramid. Since the stored
coefficients are not used to compute levels of the Gaussian
pyramids, the underlying Gaussian pyramid decomposition
of the image does not change. The primary difference
between ROLP and contrast pyramids is the use of a local
background to normalize the ratio. The contrast pyramid
computes $(L_m)_k = \{(I_m)_k / f_{k+1}[(I_m)_{k+1}]\} - 1$, and the
offset of 1 is reversed during reconstruction. On the other hand,
the ROLP pyramid computes $(L_m)_k = (I_m)_k / f_{k+1}[(I_m)_{k+1}]$.
Instead of summing coefficients during synthesis (as in the
case of Laplacian pyramid), we now reverse the decomposition
by expanding $(S_m)_{k+1}$ and multiplying it with $(L_m)_k$ to get
$(S_m)_k$. A small epsilon factor is often added to the
denominator to prevent division-by-zero issues. Figure 5 shows the
resulting fused images from the ROLP and contrast pyramid
methods. These decomposition methods are designed to
emphasize the contrast in an image.
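
The ratio coefficients and their multiplicative synthesis can be sketched for a single level as shown below. The sketch assumes that the expanded next-level image $f_{k+1}[(I_m)_{k+1}]$ is already available as a list of rows; the value of the epsilon constant, the function names, and the sample values are hypothetical.

```python
EPS = 1e-6  # small constant guarding against division by zero (assumed value)

def ratio_coeffs(level, expanded_next, contrast=False):
    """ROLP coefficients, or contrast coefficients when contrast=True."""
    offset = 1.0 if contrast else 0.0
    return [[p / (q + EPS) - offset for p, q in zip(r1, r2)]
            for r1, r2 in zip(level, expanded_next)]

def ratio_synthesis(coeffs, expanded_next, contrast=False):
    """Invert ratio_coeffs: undo the offset, then multiply."""
    offset = 1.0 if contrast else 0.0
    return [[(c + offset) * (q + EPS) for c, q in zip(r1, r2)]
            for r1, r2 in zip(coeffs, expanded_next)]

# Hypothetical level image and expanded next-level image (note the zero pixel).
level = [[10.0, 20.0], [30.0, 40.0]]
expanded = [[8.0, 16.0], [0.0, 50.0]]

rolp = ratio_coeffs(level, expanded)
contrast = ratio_coeffs(level, expanded, contrast=True)
back_rolp = ratio_synthesis(rolp, expanded)
back_contrast = ratio_synthesis(contrast, expanded, contrast=True)
```

The zero pixel in the expanded image shows why the epsilon term matters: without it the ratio would be undefined, whereas with it the round trip still recovers the original level.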

The gradient pyramid chooses the largest directional
derivative in each of four directions: horizontal, vertical, and
the two diagonal directions.16 These derivatives can be
computed using simple matrix operators. For example, at each
level of the pyramid, the four operators [1 −2 1],

Fig. 4 Fusion by selecting the maximum coefficient of Laplacian pyramids (a) and FSD pyramids (b).

Fig. 5 Fusion by selecting the maximum coefficient of the ROLP pyramids (a) and contrast pyramids (b).

where * represents the convolution operator. Coefficients are
selected for each of the four directions independently during
the fusion process and then added together to represent the
combined gradient strength at a given pixel location. A
synthesized image is reconstructed using the same procedure as
in the Laplacian pyramid case. An example of the fused
image produced by the gradient pyramid method is shown
in Fig. 6(a). These methods are designed to preserve
orientation information, which can be useful in some applications.

Morphological operations, such as opening and closing,
can be applied to the Gaussian pyramid without harmful
effects under certain circumstances and result in a
morphological pyramid.17 For example, we can compute the next
set of coefficients by morphologically opening $(I_m)_k$: first
replacing the value of a
given pixel with the smallest pixel value found within a
predefined neighborhood of that pixel (erosion), and then, on the
resulting image, replacing the value of a given pixel with the
largest pixel value found in the same neighborhood
(dilation). The resulting image can then be closed by reversing
the process, namely, first performing a dilation and then
an erosion operation. The opening operation will remove
small objects, while the closing operation will remove
noise and smooth transitions. We decimate the resulting
image to obtain our image for the next level of the pyramid,
$(I_m)_{k+1}$. We obtain the pyramid coefficients of level k + 1 as
the difference between $(I_m)_k$ and an up-sampled and dilated
version of $(I_m)_{k+1}$. While these morphological operations
may produce good-looking results, as shown in Fig. 6(b),
they are quite computationally intensive in nature, and
their usefulness in enhancing tracking performance is not
necessarily great.
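
The erosion, dilation, opening, and closing operations described above can be sketched with a square neighborhood. The neighborhood radius, the border handling (the window is clipped at the image edge), and the function names are assumptions made for illustration.

```python
def _neighborhood_filter(img, pick, radius=1):
    """Replace each pixel by pick() over its (2*radius+1)^2 neighborhood."""
    h, w = len(img), len(img[0])
    return [[pick(img[a][b]
                  for a in range(max(i - radius, 0), min(i + radius + 1, h))
                  for b in range(max(j - radius, 0), min(j + radius + 1, w)))
             for j in range(w)] for i in range(h)]

def erode(img, radius=1):
    return _neighborhood_filter(img, min, radius)

def dilate(img, radius=1):
    return _neighborhood_filter(img, max, radius)

def opening(img, radius=1):
    return dilate(erode(img, radius), radius)   # erosion, then dilation

def closing(img, radius=1):
    return erode(dilate(img, radius), radius)   # dilation, then erosion

# Hypothetical 5 x 5 images: one bright speck and one dark hole.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 9
hole = [[9] * 5 for _ in range(5)]
hole[2][2] = 0

opened = opening(speck)
closed = closing(hole)
```

In the toy images the opening removes the isolated bright pixel and the closing fills the isolated dark pixel. Every output pixel inspects its whole neighborhood, which gives a sense of why these operations are comparatively expensive.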

Finally, many specialized pyramid decompositions, such
as contourlets and wavelets, separate an image into
approximations and detail. We examined a simple discrete wavelet
transform (DWT) using the Daubechies Symmetric Spline


Fig.  6  Fusion  by  selecting  the  maximum  coefficient  of  the  gradient  pyramids  (a)  and  morphological  pyramids  (b). 
wavelet, as well as a shift-invariant discrete wavelet
transform (SIDWT), using the Haar wavelet. The DWT is applied
to an input image using two filters, $g_1 = [-2 \;\; 4 \;\; -2]$ and
$h_1 = [-1 \;\; 2 \;\; 6 \;\; 2 \;\; -1]$. In this case, $g_1$ is a high-pass
filter and $h_1$ a low-pass filter. These filters are applied to
the columns and rows of an image consecutively in one
of these four combinations: $g_1^T * g_1$, $g_1^T * h_1$, $h_1^T * g_1$, and
$h_1^T * h_1$. The output of $g_1^T * g_1$ is the high frequency content
of the image, while the output of $h_1^T * h_1$ contains only the
low-frequency content. All four combinations of the outputs are then
decimated by two to form four subband images. The
resulting low-pass image is used for the next iteration of
decomposition, while the maximum coefficients from the other
three sets are stored in the wavelet tree. An example of
the fused images produced by DWT pyramids is shown in
Fig. 7(a).
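
One level of this decomposition can be sketched with the two filters quoted above. The filters are used exactly as printed, without any normalization, and the borders are handled by replicating the edge pixel; both choices, together with the subband names, are assumptions made for this illustration.

```python
G1 = [-2, 4, -2]        # high-pass filter quoted in the text
H1 = [-1, 2, 6, 2, -1]  # low-pass filter quoted in the text

def filter_rows(img, taps):
    half = len(taps) // 2
    out = []
    for row in img:
        n = len(row)
        # Borders are handled by replicating the edge pixel (an assumption).
        out.append([sum(t * row[min(max(j + d - half, 0), n - 1)]
                        for d, t in enumerate(taps)) for j in range(n)])
    return out

def filter_cols(img, taps):
    transposed = [list(col) for col in zip(*img)]
    return [list(col) for col in zip(*filter_rows(transposed, taps))]

def dwt_level(img):
    """Return the four decimated subbands of one decomposition level."""
    # First letter: filter applied along columns; second: along rows.
    combos = {"gg": (G1, G1), "gh": (G1, H1), "hg": (H1, G1), "hh": (H1, H1)}
    subbands = {}
    for name, (col_taps, row_taps) in combos.items():
        filtered = filter_cols(filter_rows(img, row_taps), col_taps)
        subbands[name] = [row[::2] for row in filtered[::2]]  # decimate by two
    return subbands

# Hypothetical flat 4 x 4 image: only the low-pass subband should respond.
flat = [[1] * 4 for _ in range(4)]
bands = dwt_level(flat)
```

For a flat image the three subbands that involve the high-pass filter are zero, since its taps sum to zero. The low-pass subband is scaled by 64 because the unnormalized low-pass taps sum to 8 in each direction.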

For SIDWT, the filters $g_1$ and $h_1$ are defined as $g_1 = [0 \; \ldots \; 0 \;\; 0.5 \;\; 0 \; \ldots \; 0 \;\; {-0.5} \;\; 0 \; \ldots \; 0]$ and
$h_1 = [0 \; \ldots \; 0 \;\; 0.5 \;\; 0 \; \ldots \; 0 \;\; 0.5 \;\; 0 \; \ldots \; 0]$, with
$2^{(k-2)}$ zeroes in the first and last set of zeroes, and $2^{(k-1)}$
zeroes in the middle set of zeroes for level k of the pyramid.
While the SIDWT is very redundant (because it up-samples
the filter response instead of decimating the image at each
level of the pyramid), the dual-tree complex wavelet
transform (DT-CWT) can achieve approximate shift
invariance and only slight oversampling by filtering the image with
a pair of complementary filters. DT-CWT produces real and
complex coefficients at each level of the decomposition for a
total of 2d oversampling, where d is the number of levels of
decomposition. Figure 7 shows an example of the fused
images produced by SIDWT pyramids [Fig. 7(b)] and
DT-CWT [Fig. 7(c)], respectively.

The simple DWT can be prone to artifacts as a function of
position in the image, which could be particularly
problematic when using the FPSS background subtraction tracker to
detect motion information. As an object moves slightly,
artifacts could shift in the image, resulting in many unnecessary
false alarms. Hence, a SIDWT or DT-CWT is expected to
perform better in a tracking task. Similar to other pyramid
methods, we use the maximum coefficient from either
wavelet tree at each level during the image fusion phase.

3  FPSS  Tracker 

The effects of different image fusion methods were examined
and compared using the existing FPSS moving target
tracking algorithm that was developed and tested with the
original FPSS datasets. In this study, the FPSS tracker
was run on the original color and LWIR images, as well
as the fused images generated by all fusion methods
described in the previous section.

Fig. 7 Fusion by selecting the maximum coefficient of DWT pyramids (a), SIDWT pyramids (b), and DT-CWT pyramids (c).

3.1  Background  Modeling 

The  key  component  of  the  FPSS  tracker  is  its  background 
modeling  and  subtraction  process,  which  is  depicted  in 
Fig.  8.  Each  input  image  is  first  filtered  by  a  stability 
mask  and  then  channeled  through  four  image  buffers  of 
equal  size  and  depth.  The  images  in  buffers  2  and  4  are 
used  to  generate  background  models  1  and  2,  respectively. 
Instead  of  being  created  originally  from  buffer  4,  background 
model  2  can  also  be  obtained  from  a  buffer  of  models  that  is 
continuously  replenished  by  the  outgoing  representations  of 
background  model  1.  By  subtracting  the  next  input  frame 
from  these  background  models,  we  obtain  two  difference 
images.  A  difference-product  image  (DPI)  is  obtained  by 
multiplying  these  two  difference  images  pixel  by  pixel. 
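
As a rough sketch of the step just described, the code below averages the frames in two buffers to form the two background models, subtracts the input frame from each model, and multiplies the two difference images pixel by pixel. Averaging is only one possible way to derive a model, and the function names `mean_image` and `difference_product_image` are illustrative, not taken from the FPSS tracker.

```python
def mean_image(frames):
    """Pixel-wise average of a non-empty list of equally sized 2-D frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def difference_product_image(frame, buffer2, buffer4):
    """Return the DPI of `frame` against two averaged background models."""
    model1 = mean_image(buffer2)   # background model 1 (from buffer 2)
    model2 = mean_image(buffer4)   # background model 2 (from buffer 4)
    rows, cols = len(frame), len(frame[0])
    return [[(model1[r][c] - frame[r][c]) * (model2[r][c] - frame[r][c])
             for c in range(cols)] for r in range(rows)]


# Tiny 2 x 2 example: only the bottom-right pixel differs from both models.
buffer2 = [[[10, 10], [10, 10]], [[12, 10], [10, 10]]]
buffer4 = [[[10, 10], [10, 10]]]
frame = [[11, 10], [10, 30]]
dpi = difference_product_image(frame, buffer2, buffer4)
```

In this toy case the top-left pixel deviates from only one of the two models, so its product is zero, while the pixel that deviates from both models is strongly amplified.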

To begin the background modeling process, the first successfully preprocessed input image
frame is used to fill up all image buffers and to become the initial background models. For
each of the subsequent input image frames, a simple frame registration procedure is used to
reduce any potential jitter effects incurred by shaking cameras. Typically, a jitter-free
image contains a mostly stable background with a number of small but volatile areas caused
by moving objects and other transient events. In order to prevent rapidly changing
foreground pixels from ruining the background models, a stability mask is used to filter out
all unstable pixels from the input image frame. Updated by the information from DPI, this
stability mask looks for significant intensity changes based on a predefined threshold of
variability and maintains a record of the stability index at each pixel location. Only those
stable pixels on a jitter-free image are fed to buffer 1, while the once-stable but now
actively changing pixels are blocked and substituted by the corresponding stable pixels
available from buffer 1. Without the stable background models, it will be much harder to
detect and extract legitimate moving objects in the scene, while additional false alarms
will likely be generated.

Each incoming set of pixel values from the stability mask replaces the corresponding pixel
values in the oldest frame in buffer 1 to form the newest frame in buffer 1, while the oldest
frame of buffer 1 becomes the newest frame in buffer 2. The same mechanism of first-in
first-out (FIFO) frame-shift and update is applied to all image buffers continuously.
Buffer 1 serves merely as a time-delay buffer that induces a noticeable gap in time, and
potentially in content, between the current input image and the image frames in buffer 2.
Background model 1 is derived from the images in buffer 2, a derivation that can be as simple
as taking the average of all images in buffer 2. Similar to buffer 1, buffer 3 is just
another buffer to separate buffer 2 and buffer 4 in time. Background model 2 can be obtained
by either processing (e.g., averaging) the images in buffer 4 or drawing from the buffer of
models supplied by background model 1. The same background modeling structure depicted in
Fig. 8 can be extended to include four or any larger even number of background models for
more stable background representations and higher target enhancement capabilities at the
expense of additional computational resources and a longer initialization period before the
actual tracking begins.

Fig. 8 The background modeling and subtraction process in the FPSS tracker.
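
The FIFO frame-shift between buffers can be illustrated with a small sketch. It assumes whole frames are shifted (the pixel-level substitution performed by the stability mask is left out), and it uses integer labels in place of images so the movement is easy to follow. The class name `BufferChain` and its `push` method are hypothetical.

```python
from collections import deque


class BufferChain:
    """Equal-depth FIFO buffers; a frame leaving buffer i enters buffer i+1."""

    def __init__(self, first_frame, depth, num_buffers=4):
        # The first preprocessed frame fills every slot of every buffer.
        self.buffers = [deque([first_frame] * depth, maxlen=depth)
                        for _ in range(num_buffers)]

    def push(self, frame):
        """Insert the newest frame into buffer 1 and shift older frames down."""
        incoming = frame
        for buf in self.buffers:
            oldest = buf[0]        # frame about to leave this buffer
            buf.append(incoming)   # maxlen drops buf[0] automatically
            incoming = oldest      # it becomes the newest frame downstream
        return incoming            # frame that leaves the last buffer


# Frames are labeled by arrival time; label 0 is the initial fill frame.
chain = BufferChain(0, depth=2)
for t in range(1, 7):
    chain.push(t)
contents = [list(buf) for buf in chain.buffers]
```

After six pushes with a depth of two, buffer 1 holds the two newest frames, buffers 2 and 3 hold progressively older ones, and buffer 4 still holds the initial fill frame, which illustrates why a deeper or longer chain needs a longer initialization period.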

One of the advantages of using multiple disjoint background models to generate a DPI is that
the problematic "trailing effect," which is often associated with background subtraction
methods, can be suppressed effectively in this process. Since the gradually fading trails
carved out by the moving objects show up in different parts of the disjoint difference
images, as shown by the two difference images on the left side of Fig. 9, they are likely to
diminish or disappear when the corresponding DPI is formed. For the same reason,
time-dependent noise in the difference images is also suppressed during the formation of the
DPI. Another advantage of this method is that the target trails are now clearly detached
from the moving objects, which allows the subsequent target detection module to estimate the
size and location of those movers more accurately. With improved estimation of target size
and location, the target tracking module may also perform better motion estimation and track
maintenance.

An even number of background models is needed in the formation of the DPI to address the
problem of target polarity, which is a common target detection problem. Due to clothing and
ambient temperature change, the same type of moving target may assume different polarities
of pixel intensity with respect to its immediate background. Figure 10 shows a pair of LWIR
images that exhibit polarity change in human signatures during different seasons of the
year. A detector that uses a single difference image, or a DPI computed from any odd number
of difference images, has to pick the locations with both positive and negative values
simultaneously and appropriately, which is not always easy or straightforward. This problem
is alleviated, however, simply as a by-product of forming the DPI using an even number of
difference images.

Fig. 9 Enhancement of target signatures and suppression of trailing effects and noises via a DPI.

Fig. 10 Human LWIR signatures reverse polarity in winter (a) and summer (b).
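
The polarity argument can be made explicit for the two-model case. The symbols $d_1$ and $d_2$ are introduced here only for illustration and denote the two difference values at a target pixel; the target is assumed to have the same polarity relative to both background models.

```latex
\[
\mathrm{DPI} = d_1 \, d_2 , \qquad
\begin{cases}
d_1 > 0,\ d_2 > 0 & \Rightarrow\ \mathrm{DPI} > 0 \quad \text{(one polarity)}\\[2pt]
d_1 < 0,\ d_2 < 0 & \Rightarrow\ \mathrm{DPI} > 0 \quad \text{(reversed polarity)}
\end{cases}
\]
```

More generally, a product of $2n$ differences that share a common sign $s \in \{+1, -1\}$ has sign $s^{2n} = +1$, whereas a product of $2n+1$ such differences keeps the sign $s$, so a detector would again have to handle both positive and negative responses.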


3.2  Target  Detection  and  Tracking 

After a DPI is generated, a morphological operation is used to remove small spikes and to
fill up small gaps in the DPI. Furthermore, a pyramid-means method is used to enhance the
centroid and overall silhouette of the moving targets. The moving target detection process
begins with finding the brightest pixel on the post-processed DPI, which is usually
associated with the most probable moving target in the given input frame. The size of this
target is estimated by finding all the surrounding pixels that are deemed connected to the
brightest pixel. After the first moving target is detected, all the pixels within a
rectangular target-sized area of that target are suppressed to exclude them from subsequent
detections. The detection process is repeated by finding the next brightest one among the
remaining pixels until all the pixels are suppressed, a predefined number of detections are
obtained, or other user-defined stopping criteria are reached.
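
The detection loop described above can be sketched as follows. Several details are assumptions made only for this illustration: a pixel is "deemed connected" when it is 4-connected to the peak and exceeds a fixed `floor` value, suppressed pixels are set to `floor`, and the only stopping criteria are an exhausted image or a maximum detection count. The name `detect_targets` is hypothetical.

```python
from collections import deque


def detect_targets(dpi, floor, max_detections):
    """Greedy brightest-pixel detection on a post-processed DPI (2-D list)."""
    rows, cols = len(dpi), len(dpi[0])
    work = [row[:] for row in dpi]          # copy so the input is untouched
    detections = []
    while len(detections) < max_detections:
        # 1. Locate the brightest remaining pixel.
        peak, pr, pc = max((work[r][c], r, c)
                           for r in range(rows) for c in range(cols))
        if peak <= floor:                   # everything left is suppressed
            break
        # 2. Grow the 4-connected region of above-floor pixels around it.
        seen = {(pr, pc)}
        queue = deque([(pr, pc)])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in seen and work[nr][nc] > floor):
                    seen.add((nr, nc))
                    queue.append((nr, nc))
        # 3. The bounding box of the region is the estimated target size.
        r0, r1 = min(r for r, _ in seen), max(r for r, _ in seen)
        c0, c1 = min(c for _, c in seen), max(c for _, c in seen)
        detections.append({"peak": (pr, pc), "box": (r0, c0, r1, c1)})
        # 4. Suppress the rectangular target-sized area.
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                work[r][c] = floor
    return detections


dpi = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 0, 0, 0],
    [0, 0, 0, 0, 5],
]
found = detect_targets(dpi, floor=0, max_detections=10)
```

On this toy DPI the loop first reports the three-pixel blob around the value 9 and then the isolated pixel with value 5, after which every remaining pixel is at the floor and the loop stops.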

Using the detection results on consecutive input images, tracks of all moving targets are
built and maintained. In order to build a meaningful track, a noticeable moving target must
appear in multiple contiguous frames in a video sequence. This requirement may not be met
when the target is moving across the field of view of the camera at a very short range
and/or a very high speed; when the camera is operated at a very low frame rate; when the
target is occluded for an extended period of time and/or behind a very large obstacle; or
when a combination of these and other detrimental factors occurs. The FPSS tracker also uses
the previous locations, velocity, and target size of a moving target to predict the
destination of its next movement.

4  Experimental  Results 

The Second FPSS dataset consists of 53 pairs of concurrent color-LWIR video sequences for a
total of 71,236 frames, which depict various staged suspicious activities around a big
parking lot. Each video sequence was obtained at a frame rate of 10 frames per second. No
frames were dropped from any sequence in our experiments; therefore, the same frame rate was
maintained across the board. Ground-truth information (target type and target location)
associated with each observable moving target on each image frame was semi-manually
generated using a ground-truthing GUI, storing the location of all people, vehicles,
animals, and other objects for each frame. The ground-truth files associated with a given
pair of color-LWIR sequences may vary slightly in their content, as some moving targets may
sometimes be observable in only one of the two image types. Because we used the LWIR
approximation coefficients during the pyramid decompositions, and because LWIR ground-truth
files usually contain more information on the targets, we chose the LWIR ground-truth files
for the purpose of verifying the detections on fused images. Based on the ground-truth
information and the target size estimated by the FPSS tracker, we may compute the tracking
performance achievable by the original color and LWIR sequences, as well as the
performances pertaining to the fused image sequences generated by different image fusion
methods.

To qualify as a correct detection or a hit, the ground-truth location must be included in
the bounding box (target size) estimated by the FPSS tracker for the given detection.
Multiple detections on the same target were counted as only one hit, but multiple detections
on a nontarget were treated as multiple false alarms (FAs). When multiple targets in
proximity were covered by a single detection, it was treated as multiple hits. Ground-truth
targets that were not included in the bounding box of any detection were regarded as misses.
An adjustable acceptance threshold was used to vary the tradeoff between hits and FAs. While
a range of acceptance thresholds from 0.1 to 25,000 was initially considered, we actually
used the acceptance thresholds from 30 to 25,000 because very few FPSS responses had an
activation level of under 30 in our experiments. Instead of normalizing the FPSS responses
from different fusion methods and comparing their performance at different acceptance
thresholds, we just compared their hit rates at certain fixed FA rates. Plotting the FA rate
(FAR; average number of incorrect detections per frame) against the hit rate (percentage of
true targets that were correctly detected) at different acceptance thresholds yields a
receiver operating characteristic (ROC) curve. To emphasize the critical differences between
the ROC curves, we focused on the two end zones of these curves and examined the performance
at FAR of less than 0.1 FA/frame and at hit rates exceeding 80%. The ROC curves for the
original LWIR and color sequences were first generated, as shown in Fig. 11, serving as the
benchmark performance curves that are included in all performance-related figures for
comparison purposes.

Fig. 11 The performance of four simple-combination methods at low FAR region (a) and high hit rate region (b).
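
The scoring rules described above can be expressed compactly. The sketch below is an illustration under assumptions, not the evaluation code used in the experiments: ground truth is reduced to one point per target, bounding boxes are inclusive `(r0, c0, r1, c1)` tuples, and detections are assumed to have already passed the acceptance threshold. The function names `score_frame` and `hit_rate_and_far` are hypothetical.

```python
def score_frame(truth_points, boxes):
    """Count hits, misses, and false alarms for one frame.

    truth_points: list of (row, col) ground-truth target locations.
    boxes: list of (r0, c0, r1, c1) detection bounding boxes, inclusive.
    """
    def contains(box, point):
        r0, c0, r1, c1 = box
        return r0 <= point[0] <= r1 and c0 <= point[1] <= c1

    # A target is hit if any box covers it; extra boxes on it add nothing,
    # and one box covering several targets yields several hits.
    hits = sum(1 for p in truth_points if any(contains(b, p) for b in boxes))
    misses = len(truth_points) - hits
    # Every box that covers no target at all is one false alarm.
    false_alarms = sum(1 for b in boxes
                       if not any(contains(b, p) for p in truth_points))
    return hits, misses, false_alarms


def hit_rate_and_far(frames):
    """Aggregate (truth_points, boxes) pairs into hit rate (%) and FA/frame.

    Assumes at least one frame and at least one ground-truth target overall.
    """
    hits = misses = fas = 0
    for truth_points, boxes in frames:
        h, m, f = score_frame(truth_points, boxes)
        hits, misses, fas = hits + h, misses + m, fas + f
    return 100.0 * hits / (hits + misses), fas / len(frames)


frames = [
    ([(2, 2), (2, 3), (8, 8)], [(1, 1, 3, 4), (0, 0, 2, 2), (5, 5, 6, 6)]),
    ([(4, 4)], []),
]
hit_rate, far = hit_rate_and_far(frames)
```

Repeating this computation while sweeping the acceptance threshold would give one (FAR, hit rate) point per threshold, which is how the ROC curves are traced out.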

Figure 11 also shows the ROC curves associated with the fused images generated by the four
simple-combination methods: simple averaging, PCA-weighted averaging, maximum pixel
selection, and minimum pixel selection. Their performances at the low FAR region are shown
on the left graph, while the right graph shows their performance as more FAs are allowed.
From the left graph, it is clear that the original LWIR images performed the best at low FAR
among this group of six candidates. On the other hand, the original color images were
lagging behind their LWIR counterparts consistently due to a significant increase in the
number of FAs caused by headlight glares and windshield reflections in the evening hours,
and protracted shadows under the slanted sun. The right graph shows that the advantage of
LWIR sequences over color sequences continues to hold at the high FAR region.

Given the nature of simply averaging or selecting the pixels of the original color and LWIR
images by the four simple-combination methods, we expected that their resulting fused images
would perform somewhere between the original color and LWIR images. As evident from
Fig. 11(a), this was, indeed, the case for the FAR region of 0.02 or less FAs per frame. As
the allowable number of FAs was increased, the fused images produced by simple averaging and
maximum pixel selection methods continued to yield hit rates that were between those
produced by the original color and LWIR images, as demonstrated in Fig. 11(b). The
performance associated with the fused images generated by the PCA-weighted averaging and
minimum pixel selection methods, however, gradually fell below the performance of the
original color images. In other words, there was no performance gain in tracking at any FAR
by using the images fused with simple combination methods over the original LWIR images. At
FARs higher than 0.02 FA per frame, even the original color images outperformed the fused
images produced by the PCA-weighted averaging and minimum pixel methods.

The fusion methods based on pyramid structures were run with an identical set of
configuration parameters, namely five levels of decomposition and a 7 × 7 neighborhood size
when running a saliency/match measure. Based on their resulting ROC curves, these
pyramid-based fusion methods were categorized into two groups for subsequent discussions:
four inferior methods and five superior methods. As shown in Table 1, all nine pyramid-based
methods are much more computationally intensive than the four simple combination methods,
especially the SIDWT, gradient, and morphological pyramids. Although the DT-CWT is more than
four times faster than its more redundant variant, SIDWT, it is still considerably slower
than the five simpler pyramid-based methods, three of which are ranked together in the
superior pyramid column. More computations do not always generate better results, and as we
can see, among the pyramid-based methods there are faster and slower candidates in both the
inferior and superior columns of Table 1.

Table 1 CPU time (sec) needed to fuse 30 images using Matlab code on a Dell T7400 workstation.

Simple combinations  CPU time    Inferior pyramids  CPU time    Superior pyramids  CPU time
Simple average          1.280    FSD                  21.670    Laplacian            24.040
PCA average             2.030    Gradient             78.970    ROLP                 23.050
Maximum pixel           1.560    DWT                  22.740    Contrast             23.240
Minimum pixel           1.840    Morphological        62.530    SIDWT               209.600
                                                                DT-CWT               49.940

Fig. 12 The performance of four inferior pyramid-based fusion methods at low FAR region (a) and high hit rate region (b).

As shown by Fig. 12(a), the FSD, gradient, and DWT achieved slightly worse performance than
the original LWIR images at low FARs, whereas the morphological pyramid method clearly
lagged behind others under the same conditions. The picture is somewhat different at the
other end of these ROC curves, as shown by Fig. 12(b), where the DWT and morphological
pyramid methods were able to surpass the LWIR curve at the FAR region of 0.7 FA per frame or
higher. Since alternative pyramid-based methods offer more consistent gains over the
complete range of FAR, we deem these four pyramid-based methods inferior.

Finally, there are five pyramid-based fusion methods that have achieved good results on both
ends of the ROC curves: the Laplacian, ROLP, contrast, SIDWT, and DT-CWT pyramid methods. As
shown by Fig. 13(a), these five fusion methods clearly outperformed the original color and
LWIR images from the beginning and attained the largest advantage at the FAR of around
0.02 FA per frame. At this FAR, the hit rates for the original color and LWIR images are
54.29% and 62.99%, respectively. As shown in Table 2, the corresponding hit rates of the
images fused by contrast pyramid and ROLP pyramid methods are 76.94% and 75.11%,
respectively. With improvements of 12 to 14 percentage points over the LWIR images, the
performance gains achieved by these two fusion methods are quite remarkable at this FAR.

Table 2 Performance (hit rate in %/FA per frame) of the 13 fusion methods at low FAR region.

Simple combinations  HR/FAR           Inferior pyramids  HR/FAR           Superior pyramids  HR/FAR
Simple average       56.14/0.02005    FSD                60.34/0.02008    Laplacian          73.51/0.02008
PCA average          53.43/0.02008    Gradient           62.90/0.02005    ROLP               75.11/0.02005
Maximum pixel        55.71/0.02008    DWT                61.47/0.02008    Contrast           76.94/0.02005
Minimum pixel        56.17/0.02008    Morphological      40.27/0.02008    SIDWT              67.53/0.02005
                                                                          DT-CWT             73.80/0.02002

Fig. 13 The performance of five superior pyramid-based fusion methods at low FAR region (a) and high hit rate region (b).

Among the five superior pyramid-based methods, SIDWT is clearly lagging behind other methods
in performance. Furthermore, the computational cost of SIDWT is about nine times that of the
contrast pyramid and ROLP pyramid methods. Therefore, SIDWT is the least desirable method
among this group. Although the performance of DT-CWT is competitive with that of the
contrast, ROLP, and Laplacian methods, it requires more than twice as much CPU time to
complete the same image fusion task, which makes it less attractive among the group of
superior pyramids.
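
The CPU-time ratios quoted above follow directly from Table 1. The short check below simply divides the tabulated times; the dictionary layout and variable names are illustrative.

```python
# CPU times in seconds for the superior pyramids, copied from Table 1.
cpu = {"Contrast": 23.240, "ROLP": 23.050, "Laplacian": 24.040,
       "SIDWT": 209.600, "DT-CWT": 49.940}

# Ratio of each slower method to each of the three faster ones.
ratios = {(slow, fast): cpu[slow] / cpu[fast]
          for slow in ("SIDWT", "DT-CWT")
          for fast in ("Contrast", "ROLP", "Laplacian")}

for (slow, fast), value in ratios.items():
    print(f"{slow} / {fast} = {value:.2f}")
```

The SIDWT ratios against the contrast and ROLP pyramids come out close to nine, and all three DT-CWT ratios exceed two, consistent with the statements in the text.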

Based on the performance of the superior fusion methods at the high FAR region shown in
Fig. 13(b), it is obvious that the advantage of these methods over the original color and
LWIR images is still maintained at every point in the high FAR region, even though the
performance gain is less significant than that in the low FAR region. For example, the hit
rates of the images fused by contrast pyramid and ROLP pyramid methods at a FAR of 0.80 FA
per frame are 95.52% and 95.37%, respectively, exceeding those of color (93.66%) and LWIR
(94.31%) images by slightly more than 1%.

As an object moves, it appears as a constant shift of some pixels in the image. For a method
such as the regular DWT, large coefficient variations can be incurred by this shift. If it
occurs in either the LWIR or color image, these large image coefficients can significantly
outweigh the expected ones during the image fusion process. Instead of appearing as slight
movements, some random variations may appear, which are more likely to be interpreted as
noise. Due to the shift invariance property of DT-CWT, this transform is less affected by
this problem and hence performs among the best. SIDWT should have offered advantages similar
to those of the DT-CWT, but the oversampling during the decomposition process actually
produces some conflicts when the color and LWIR images are merged, which hampers the SIDWT
performance somewhat. The ROLP, contrast, and Laplacian methods work exceedingly well,
because they all emphasize the contrast (brightness variation in color or LWIR images)
information in the input images. On the other hand, gradient and morphological methods that
emphasize the edge information in the input images do not perform very well because many
faint targets may not have strong edges.


5  Conclusions 

Although a given sensor may be easily fooled sometimes, it is much harder to trick a number
of sensors simultaneously at any given time. For this reason, we explored and exploited the
rather complementary natures of two common imaging sensors: LWIR and color visible sensors.
Instead of harnessing prior background knowledge and external information sources (such as
metadata on weather conditions, time of the day, season of the year, site characteristics,
number of targets, target ranges, depression angle, speed of movement, and other related
information) to perform symbolic-level image fusion, we focused solely on pixel-level image
fusion in this work. Therefore, the techniques examined and the results obtained in this
work are more readily transferable to other applications and scenarios that process color
and LWIR imageries.

Based on the results generated by the four simple-combination methods examined in this work,
we conclude that these simple methods are not useful, because they performed worse than the
original LWIR images alone. Among the nine pyramid-based image fusion methods, the gradient
and FSD methods are the worst candidates because they required 10 to 60 times more CPU time
than the simple combination methods, but performed even worse at the high FAR region. The
morphological and DWT methods are slightly better than the gradient and FSD methods,
primarily because they managed to outperform LWIR in the high FAR region. Given their
performances and computational requirements, these four pyramid-based methods are deemed
inferior in general.

The Laplacian, ROLP, contrast, SIDWT, and DT-CWT are found to be superior image fusion
methods, because they consistently outperformed LWIR in every FAR region. The contrast and
ROLP methods are considered the best image fusion methods to pair with the FPSS tracker
because their ROC curves are consistently on top of all other ROC curves produced in this
work. Furthermore, the computational requirements of these two methods are almost the lowest
among the pyramid-based methods. On the other hand, SIDWT is ranked at the bottom in this
group, as it performed the worst and consumed four to nine times more CPU time than its
counterparts in this group did.

For future work, a potential way of improving image fusion performance is to treat each
color image as three separate images (R, G, and B images) and fuse these three images with
the LWIR image together. The fusion algorithms examined in this work do not limit the number
of images that can be fused together. Therefore, short-wave infrared, mid-wave infrared, and
hyperspectral imageries could also be considered, if they are properly coregistered.
Performance may also be improved by linking the image fusion process with the tracking
algorithm, through which the information that is critical to the tracker may be better
preserved or enhanced. For instance, a region-based segmentation algorithm may be
incorporated into the DT-CWT image fusion process [18, 19]. The segmentation algorithm could
exploit the limited redundancy in DT-CWT and tie the feature level and pixel level fusion
algorithms together.

Abstract: In this paper, we propose a new sparsity-based algorithm for automatic target
detection in hyperspectral imagery (HSI). This algorithm is based on the concept that a
pixel in HSI lies in a low-dimensional subspace and thus can be represented as a sparse
linear combination of the training samples. The sparse representation (a sparse vector
corresponding to the linear combination of a few selected training samples) of a test sample
can be recovered by solving an $\ell_0$-norm minimization problem. With the recent
development of the compressed sensing theory, such a minimization problem can be recast as a
standard linear programming problem or efficiently approximated by greedy pursuit
algorithms. Once the sparse vector is obtained, the class of the test sample can be
determined by the characteristics of the sparse vector on reconstruction. In addition to the
constraints on sparsity and reconstruction accuracy, we also exploit the fact that in HSI
the neighboring pixels have a similar spectral characteristic (smoothness). In our proposed
algorithm, a smoothness constraint is also imposed by forcing the vector Laplacian at each
reconstructed pixel to be minimum all the time within the minimization process. The proposed
sparsity-based algorithm is applied to several hyperspectral images to detect targets of
interest. Simulation results show that our algorithm outperforms the classical hyperspectral
target detection algorithms, such as the popular spectral matched filters, matched subspace
detectors, adaptive subspace detectors, as well as binary classifiers such as support vector
machines.

Index Terms: Hyperspectral imagery, sparse recovery, sparse representation, spatial
correlation, target detection.


I.  Introduction 

Hyperspectral remote sensors capture digital images in hundreds of narrow spectral bands
(about 10 nm wide), which span the visible to infrared spectrum [1]. Pixels in HSI are
represented by B-dimensional vectors where B is the number of spectral bands. Different
materials are usually assumed to be spectrally separable as they reflect electromagnetic
energy differently at specific wavelengths. This property enables discrimination of
materials based on the radiance spectrum obtained by hyperspectral imagery. HSI has found
many applications in various fields such as military [2]-[4], agriculture [5], [6], and
mineralogy [7]. One of the important applications of HSI is target detection, which can be
viewed as a two-class classification problem where pixels are labeled as target (target
present) or background (target absent) based on their spectral characteristics. Support
vector machines [8], [9] have been a powerful tool to solve supervised classification
problems and have shown a good classification performance for hyperspectral classification
[10], [11]. A number of algorithms also have been proposed for target detection in HSI based
on statistical hypothesis testing techniques [2]. Among these approaches, spectral matched
filters [12], [13], matched subspace detectors [14], and adaptive subspace detectors [15]
have been widely used to detect targets of interest. The details of these classical
algorithms will be described in the next section.

Recently, a novel signal classification technique via sparse representation has been
proposed for face recognition [16]. It is observed that aligned faces of the same object
with varying lighting conditions approximately lie in a low-dimensional subspace [17]. Thus,
a test face image can be sparsely represented by training samples from all classes. The most
compact representation can be obtained by solving a sparsity-constrained optimization
problem. This algorithm exploits the discriminative nature of sparse representation, and the
reconstruction of the test sample directly provides its classification label. This idea
naturally extends to other signal classification problems such as iris recognition [18],
tumor classification [19], and HSI unmixing [20].

In this paper, we propose a target detection algorithm based on sparse representation for
HSI data. We use the same sparsity model as in [16], where a test sample is approximately
represented by very few training samples from both target and background dictionaries, and
the recovered sparse representation is used directly for detection. In addition to the
constraints on sparsity and reconstruction accuracy, we show that it is necessary to exploit
the fact that neighboring HSI pixels usually have similar spectral characteristics as well.
To achieve this, we impose a smoothing constraint on the reconstructed image by forcing the
vector Laplacian, as defined in Section III-D, of the reconstructed pixels to be zero. By
incorporating this spatial correlation, the detection performance is significantly improved
for images in which targets consist of multiple pixels.
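
To make the idea of detection from a sparse representation concrete, the sketch below uses plain matching pursuit, one simple member of the greedy pursuit family, in place of whichever solver the paper actually employs. It is only an illustration under several assumptions: dictionary atoms have unit norm, the detection score is taken to be the difference between the reconstruction errors of the background part and the target part of the representation, and the smoothness constraint is omitted entirely. All function names and the toy three-band dictionary are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(x, atoms, sparsity):
    """Greedy sparse coding: x ~ sum_i coef[i] * atoms[i], atoms unit-norm."""
    residual = list(x)
    coef = [0.0] * len(atoms)
    for _ in range(sparsity):
        # Pick the atom most correlated with the current residual.
        corr = [dot(residual, a) for a in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coef[best] += corr[best]
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    return coef


def residual_energy(x, atoms, coef, keep):
    """Squared error when x is rebuilt only from the atoms indexed by `keep`."""
    approx = [sum(coef[i] * atoms[i][b] for i in keep) for b in range(len(x))]
    return sum((xi - ai) ** 2 for xi, ai in zip(x, approx))


def detect(x, background_atoms, target_atoms, sparsity=2):
    """Score is positive when the target atoms explain x better."""
    atoms = background_atoms + target_atoms
    coef = matching_pursuit(x, atoms, sparsity)
    nb = len(background_atoms)
    r_background = residual_energy(x, atoms, coef, range(nb))
    r_target = residual_energy(x, atoms, coef, range(nb, len(atoms)))
    return r_background - r_target


# Toy three-band dictionaries (unit-norm atoms).
background = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
target = [[0.0, 0.0, 1.0]]
target_score = detect([0.1, 0.0, 0.9], background, target)
background_score = detect([0.8, 0.6, 0.0], background, target)
```

A pixel would be declared a target when its score exceeds a chosen threshold; here the first test pixel yields a positive score and the second a negative one.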

One of the advantages of our proposed approach is that there is no explicit assumption on
the statistical distribution characteristics of the observed data as in the previous target
detection algorithms [12]-[15]. Furthermore, in the spectral matched filter, the target
spectral signature is a single vector, usually obtained by averaging the training target
samples or from a spectral library. However, using a single target spectrum is usually
insufficient to represent the target spectral characteristics since the target spectrum
changes with the environmental situation. This problem can be avoided by using a target
subspace model represented by training samples that account for the target spectrum under
various conditions of illumination and atmospheric conditions, making the dictionary
invariant to the environmental variations [21], [22]. This environmental invariant approach
can easily be incorporated into our algorithm by augmenting the target and background
dictionaries with synthetically generated spectral signatures in order to construct better
target and background subspaces. Moreover, unlike the other detectors based on statistical
hypothesis testing, the sparsity model in our approach has the flexibility of imposing
additional restrictions corresponding to the characteristics of HSI such as smoothness
across neighboring hyperspectral pixels.

The  paper  is  structured  as  follows.  Section  II  briefly  de¬ 
scribes  several  previously  proposed  approaches  commonly 
used  in  automatic  target  detection  in  HSI.  Our  sparsity-driven 
target  detection  algorithm  is  presented  in  Section  III.  The 
effectiveness  of  the  proposed  method  is  demonstrated  by  sim¬ 
ulation  results  presented  in  Section  IV.  Conclusions  are  drawn 
in  Section  V.  Throughout  this  paper,  matrices  and  vectors  are 
denoted  by  upper  and  lower  case  boldface  letters,  respectively. 

II.  Previous  Approaches 

In this section, we briefly introduce previously developed approaches
for target detection in HSI. Specifically, we describe
the problem formulation of support vector machines (SVMs), followed
by the signal models and detector expressions of the classical
detectors, including the spectral matched filter (SMF), matched
subspace detectors (MSDs), and adaptive subspace detectors
(ASDs). Implementation details of the three statistical detectors
and their nonlinear (kernel) versions can be found in [23],
whereas details of SVM can be found in [24].

A.  Support  Vector  Machines 

The  SVM  approach  [8]  solves  the  supervised  binary  classifi¬ 
cation  problem  by  seeking  the  optimal  hyperplane  that  separates 
two  classes  with  the  largest  margin.  A  nonlinear  SVM  (called 
kernel  SVM)  is  often  implemented  to  further  improve  the  sepa¬ 
ration  between  classes  by  projecting  the  samples  onto  a  higher 
dimensional  feature  space.  In  kernel  SVM,  the  dot  products  in 
the  original  SVM  formulation  are  replaced  by  a  nonlinear  kernel 
function  using  the  kernel  trick  [8]. 

It has also been shown that integrating contextual information
via composite kernels in SVM (i.e., contextual
SVM) improves HSI classification over the
traditional spectral-only SVM [24], [25]. In contextual SVM,
a pixel $\mathbf{x}_i$ is redefined as a combination of the spectral pixel
$\mathbf{x}_i^{\omega}$ and its spatial feature $\mathbf{x}_i^{s}$ (e.g., the mean and standard deviation
per spectral band) extracted in a small neighborhood.
In this paper, we implemented contextual SVM with a composite
kernel that fuses the spectral and spatial information via
a weighted summation

$$K(\mathbf{x}_i, \mathbf{x}_j) = \mu K_s(\mathbf{x}_i^{s}, \mathbf{x}_j^{s}) + (1 - \mu) K_{\omega}(\mathbf{x}_i^{\omega}, \mathbf{x}_j^{\omega}) \tag{1}$$

where $\mu \in [0, 1]$ is the tradeoff between the spatial kernel $K_s$ and
the spectral kernel $K_{\omega}$. Examples of possible kernels can be found
in [26].
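As a rough illustration of the weighted summation in (1), the following sketch builds a composite kernel matrix from two Gaussian RBF kernels. The choice of RBF kernels, the function names, the kernel widths, and the synthetic features are all assumptions made for this example; the text only specifies the weighted-sum form and the tradeoff parameter.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def composite_kernel(Xs, Ys, Xw, Yw, mu, gamma_s=1.0, gamma_w=1.0):
    """Weighted sum mu * K_s + (1 - mu) * K_w, following the form of (1).

    Xs, Ys: spatial features (one row per pixel); Xw, Yw: spectral pixels.
    The RBF kernels and their widths are illustrative choices.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    K_s = rbf_kernel(Xs, Ys, gamma_s)
    K_w = rbf_kernel(Xw, Yw, gamma_w)
    return mu * K_s + (1.0 - mu) * K_w

# Synthetic stand-in data: 6 pixels, 10 bands, 20 spatial features per pixel
# (e.g., per-band mean and standard deviation over a small neighborhood).
rng = np.random.default_rng(0)
spectral = rng.random((6, 10))
spatial = rng.random((6, 20))
K = composite_kernel(spatial, spatial, spectral, spectral, mu=0.4)
```

A matrix built this way can be passed to any SVM implementation that accepts a precomputed kernel; a convex combination of two valid kernels is itself a valid kernel.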

B.  Spectral  Matched  Filter 

Let $\mathbf{x} = [x_1\ x_2\ \cdots\ x_B]^T$ be a spectral observation consisting
of $B$ spectral bands. The model for SMF can be expressed
by

$$\begin{aligned} H_0 &: \mathbf{x} = \mathbf{n}, && \text{target absent} \\ H_1 &: \mathbf{x} = a\mathbf{s} + \mathbf{n}, && \text{target present} \end{aligned} \tag{2}$$

where $a$ is the target abundance measure ($a = 0$ when no
target is present and $a > 0$ when a target is present),
$\mathbf{s} = [s_1\ s_2\ \cdots\ s_B]^T$ is the spectral signature of the target, and
$\mathbf{n}$ is the additive background noise.

Assume $\mathbf{n}$ is zero-mean Gaussian random noise. Using the
generalized likelihood ratio test (GLRT), the output of SMF for
a test input $\mathbf{x}$ is given by [12]

$$D_{\mathrm{SMF}}(\mathbf{x}) = \frac{\mathbf{s}^T \hat{\mathbf{C}}^{-1} \mathbf{x}}{\mathbf{s}^T \hat{\mathbf{C}}^{-1} \mathbf{s}} \tag{3}$$

where $\hat{\mathbf{C}}$ represents the estimated covariance matrix for the centered
observation data. If the output $D_{\mathrm{SMF}}(\mathbf{x})$ is greater than a
prescribed threshold $\delta$, then the test sample will be determined
as a target; otherwise, it will be labeled as background.
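The matched-filter ratio in (3) can be sketched in a few lines. The function names, the synthetic Gaussian background, the target signature, and the threshold value below are assumptions for illustration only; the test pixels are assumed to be already centered.

```python
import numpy as np

def estimate_covariance(X):
    """Sample covariance of the rows of X after removing the mean."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def smf_output(X, s, C):
    """Evaluate s^T C^{-1} x / (s^T C^{-1} s) for every row x of X."""
    w = np.linalg.solve(C, s)      # C^{-1} s, without forming the inverse
    return (X @ w) / (s @ w)       # C is symmetric, so x^T C^{-1} s = s^T C^{-1} x

rng = np.random.default_rng(1)
B = 8                                          # number of bands (illustrative)
background = rng.normal(size=(500, B))         # synthetic zero-mean noise
s = np.linspace(1.0, 2.0, B)                   # hypothetical target signature
C_hat = estimate_covariance(background)

test_pixels = np.vstack([s, 0.5 * s, background[:3]])
scores = smf_output(test_pixels, s, C_hat)
delta = 0.25                                   # illustrative threshold
is_target = scores > delta
```

By construction the output equals the abundance when a pixel is a noiseless multiple of the signature, so the first two scores are 1 and 0.5.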

Variations  of  SMF  include  the  adaptive  SMF  (ASMF)  where 
the  background  clutter  covariance  matrix  is  estimated  from  a 
small  number  of  samples  in  the  neighborhood  of  the  test  sample 
and  the  regularized  SMF  [27]  where  a  regularization  term 
is  added  to  force  the  filter  coefficients  to  shrink  and  become 
smooth.  The  regularized  SMF  is  implemented  in  Section  IV  for 
detector  performance  comparison. 

C.  Matched  Subspace  Detectors 

In  the  previous  SMF  approach,  only  a  single  target  spectral 
signature  is  used.  However,  in  MSD,  a  pixel  is  modeled  in  terms 
of  target  subspace  and  background  subspace  which  are  obtained 
using  target  and  background  training  data,  respectively.  The 
target  detection  set-up  for  MSD  is 

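$$\begin{aligned} H_0 &: \mathbf{x} = \mathbf{B}\boldsymbol{\zeta} + \mathbf{n}, && \text{target absent} \\ H_1 &: \mathbf{x} = \mathbf{T}\boldsymbol{\theta} + \mathbf{B}\boldsymbol{\zeta} + \mathbf{n}, && \text{target present} \end{aligned} \tag{4}$$

where $\mathbf{B}$ and $\mathbf{T}$ represent matrices whose columns are linearly
independent and span the background and target subspaces, respectively;
$\boldsymbol{\zeta}$ and $\boldsymbol{\theta}$ are unknown vectors whose entries are coefficients
accounting for the abundances of the corresponding
column vectors of $\mathbf{B}$ and $\mathbf{T}$, respectively; and $\mathbf{n}$ is additive
Gaussian noise.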

The GLRT for the above model is [14]

$$D_{\mathrm{MSD}}(\mathbf{x}) = \frac{\mathbf{x}^T (\mathbf{I} - \mathbf{P}_B)\,\mathbf{x}}{\mathbf{x}^T (\mathbf{I} - \mathbf{P}_{TB})\,\mathbf{x}} \tag{5}$$

where $\mathbf{P}_B$ is the projection matrix associated with the background
subspace $\langle \mathbf{B} \rangle$, and $\mathbf{P}_{TB}$ is the projection matrix associated
with the target-and-background subspace $\langle \mathbf{TB} \rangle$. Usually,
the eigenvectors corresponding to the significant eigenvalues of
the target and background covariance matrices are used to generate
the columns of $\mathbf{T}$ and $\mathbf{B}$, respectively. For a prescribed
threshold $\delta$, if the output $D_{\mathrm{MSD}}(\mathbf{x}) > \delta$, then $\mathbf{x}$ will be labeled
as target; otherwise, it will be labeled as background.
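A minimal sketch of the ratio in (5), assuming the subspace matrices have linearly independent columns so that a QR factorization yields the projectors. The random subspaces, the noise level, and all names here are illustrative assumptions rather than the data used in the paper.

```python
import numpy as np

def projector(M):
    """Orthogonal projector onto the column space of M (full column rank assumed)."""
    Q, _ = np.linalg.qr(M)
    return Q @ Q.T

def msd_output(x, T, Bg):
    """Ratio of residual energies outside the background and outside the joint subspace."""
    I = np.eye(x.shape[0])
    P_b = projector(Bg)                      # background subspace
    P_tb = projector(np.hstack([T, Bg]))     # target-and-background subspace
    return (x @ (I - P_b) @ x) / (x @ (I - P_tb) @ x)

rng = np.random.default_rng(2)
n_bands = 20
Bg = rng.normal(size=(n_bands, 3))           # synthetic background subspace
T = rng.normal(size=(n_bands, 2))            # synthetic target subspace

x_background = Bg @ np.array([0.5, 0.3, 0.2]) + 0.01 * rng.normal(size=n_bands)
x_target = (T @ np.array([0.6, 0.4]) + Bg @ np.array([0.2, 0.1, 0.1])
            + 0.01 * rng.normal(size=n_bands))

d_background = msd_output(x_background, T, Bg)
d_target = msd_output(x_target, T, Bg)
```

Because the joint subspace contains the background subspace, the ratio is never below 1; it grows when part of the pixel can only be explained by the target columns.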

D.  Adaptive  Subspace  Detectors 

A scaled background noise under $H_1$ is used in ASD because,
in the case of subpixel targets, the amount of background-covered
area may be different from that of a pure background pixel.
For ASD, the detection model for a measurement $\mathbf{x}$ is

$$\begin{aligned} H_0 &: \mathbf{x} = \mathbf{n}, && \text{target absent} \\ H_1 &: \mathbf{x} = \mathbf{U}\boldsymbol{\theta} + \sigma\mathbf{n}, && \text{target present} \end{aligned} \tag{6}$$

where $\mathbf{U}$ is a matrix whose columns are linearly independent
vectors that span the target subspace $\langle \mathbf{U} \rangle$, $\boldsymbol{\theta}$ is an unknown
vector of the abundances of the corresponding columns of $\mathbf{U}$,
$\mathbf{n}$ is Gaussian random noise, and $\sigma$ is a scalar. The measurement
$\mathbf{x}$ is assumed to be background noise under hypothesis $H_0$
and a linear combination of a target subspace signal and scaled
background noise under hypothesis $H_1$.

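The GLRT for the above problem is given by [15]

$$D_{\mathrm{ASD}}(\mathbf{x}) = \frac{\mathbf{x}^T \hat{\mathbf{C}}^{-1} \mathbf{U} (\mathbf{U}^T \hat{\mathbf{C}}^{-1} \mathbf{U})^{-1} \mathbf{U}^T \hat{\mathbf{C}}^{-1} \mathbf{x}}{\mathbf{x}^T \hat{\mathbf{C}}^{-1} \mathbf{x}} \tag{7}$$

where $\hat{\mathbf{C}}$ is the estimated background covariance. Similar to the
cases of SMF and MSD, if $D_{\mathrm{ASD}}(\mathbf{x}) > \delta$, then $\mathbf{x}$ will be declared
as target; otherwise, it will be labeled as background.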
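The quadratic-form ratio in (7) can be evaluated without explicit matrix inverses, as in the sketch below. The covariance estimate from synthetic noise, the random target subspace, and the function name are assumptions made for the example.

```python
import numpy as np

def asd_output(x, U, C):
    """x^T C^-1 U (U^T C^-1 U)^-1 U^T C^-1 x / (x^T C^-1 x), with C symmetric."""
    Ci_x = np.linalg.solve(C, x)            # C^{-1} x
    Ci_U = np.linalg.solve(C, U)            # C^{-1} U
    G = U.T @ Ci_U                          # U^T C^{-1} U
    v = U.T @ Ci_x                          # U^T C^{-1} x
    return (v @ np.linalg.solve(G, v)) / (x @ Ci_x)

rng = np.random.default_rng(3)
n_bands = 12
noise = rng.normal(size=(400, n_bands))     # synthetic background noise samples
C_hat = noise.T @ noise / noise.shape[0]    # estimated background covariance
U = rng.normal(size=(n_bands, 2))           # hypothetical target subspace

x_in_span = U @ np.array([1.5, -0.7])       # noiseless target-subspace signal
x_random = rng.normal(size=n_bands)

d_span = asd_output(x_in_span, U, C_hat)
d_rand = asd_output(x_random, U, C_hat)
d_rand_scaled = asd_output(3.0 * x_random, U, C_hat)
```

The output behaves like a squared cosine in the whitened space: it lies in [0, 1], reaches 1 for a signal inside the target subspace, and does not change when the pixel is rescaled.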

III.  Sparsity-Based  Target  Detection 

In  this  section,  we  introduce  the  first  sparsity-based  HSI 
target  detection  algorithm  by  sparsely  representing  the  test 
sample  using  a  structured  dictionary  consisting  of  target  and 
background  training  samples.  We  first  describe  the  details  of 
the  sparse  subspace  model  employed  in  the  proposed  algorithm, 
and  then  demonstrate  its  ability  as  a  classifier. 

A.  Sparsity  Model 

Let $\mathbf{x}$ be a hyperspectral pixel observation, which is a $B$-dimensional
vector whose entries correspond to responses to various
spectral bands. If $\mathbf{x}$ is a background pixel, its spectrum approximately
lies in a low-dimensional subspace spanned by the
background training samples $\{\mathbf{a}_i^b\}_{i=1,2,\ldots,N_b}$. The pixel $\mathbf{x}$ can
then be approximately represented as a linear combination of
the training samples as follows:

$$\mathbf{x} \approx \alpha_1 \mathbf{a}_1^b + \alpha_2 \mathbf{a}_2^b + \cdots + \alpha_{N_b} \mathbf{a}_{N_b}^b = \underbrace{[\mathbf{a}_1^b\ \mathbf{a}_2^b\ \cdots\ \mathbf{a}_{N_b}^b]}_{\mathbf{A}_b}\ \underbrace{[\alpha_1\ \alpha_2\ \cdots\ \alpha_{N_b}]^T}_{\boldsymbol{\alpha}} = \mathbf{A}_b \boldsymbol{\alpha} \tag{8}$$

where $N_b$ is the number of background training samples, $\mathbf{A}_b$
is the $B \times N_b$ background dictionary whose columns are the
background training samples (also called atoms), and $\boldsymbol{\alpha}$ is an
unknown vector whose entries are the abundances of the corresponding
atoms in $\mathbf{A}_b$. In our model, $\boldsymbol{\alpha}$ turns out to be a sparse
vector (i.e., a vector with only a few nonzero entries). To better


Fig. 1. Example of sparse representation of a background pixel. (a) The original
pixel $\mathbf{x}$ (blue solid) and its approximation $\mathbf{A}_b\boldsymbol{\alpha}$ represented by four training
samples in $\mathbf{A}_b$ (red dashed). The MSE between $\mathbf{x}$ and $\mathbf{A}_b\boldsymbol{\alpha}$ is $7.25 \times 10^{-6}$.
(b) The sparse representation $\boldsymbol{\alpha}$ of $\mathbf{x}$. (c) The four background training spectral
signatures corresponding to the nonzero entries of $\boldsymbol{\alpha}$.


illustrate this model, an example is shown in Fig. 1. A background
sample $\mathbf{x}$ consisting of $B = 150$ bands (blue solid)
and its approximation $\mathbf{A}_b\boldsymbol{\alpha}$ (red dashed) are shown in Fig. 1(a).
The background dictionary contains $N_b = 1300$ training samples
which are randomly picked from the entire image, including
spectral signatures of multiple background materials (e.g., vegetation,
dirt road, and soil). The sparse representation $\boldsymbol{\alpha}$ is shown
in Fig. 1(b). We see that only 4 out of the 1300 entries of $\boldsymbol{\alpha}$ are
nonzero. The four atoms (background training samples) of $\mathbf{A}_b$
corresponding to the nonzero entries are shown in Fig. 1(c). The
test sample $\mathbf{x}$ is approximated by a linear combination of only
four training atoms with a small reconstruction error: the mean
squared error (MSE) is $7.25 \times 10^{-6}$.

Similarly, a target pixel $\mathbf{x}$ approximately lies in the target subspace
spanned by the target training samples $\{\mathbf{a}_i^t\}_{i=1,2,\ldots,N_t}$,
and can also be sparsely represented by a linear combination
of the training samples

$$\mathbf{x} \approx \beta_1 \mathbf{a}_1^t + \beta_2 \mathbf{a}_2^t + \cdots + \beta_{N_t} \mathbf{a}_{N_t}^t = \underbrace{[\mathbf{a}_1^t\ \mathbf{a}_2^t\ \cdots\ \mathbf{a}_{N_t}^t]}_{\mathbf{A}_t}\ \underbrace{[\beta_1\ \beta_2\ \cdots\ \beta_{N_t}]^T}_{\boldsymbol{\beta}} = \mathbf{A}_t \boldsymbol{\beta} \tag{9}$$

where $N_t$ is the number of target training samples, $\mathbf{A}_t$ is the
$B \times N_t$ target dictionary consisting of the target training pixels,
and $\boldsymbol{\beta}$ is a sparse vector whose entries contain the abundances
of the corresponding target atoms in $\mathbf{A}_t$. An example demonstrating
the effectiveness of this sparse-representation model is
depicted in Fig. 2. The target dictionary has $N_t = 18$ training
samples. Note that because target spectral signatures are rarely
available, the training dictionary for
targets is usually much smaller than the background dictionary.
Fig. 2(a) shows the original target spectrum (blue solid)
and its approximation (red dashed) from four training atoms.
The sparse vector $\boldsymbol{\beta}$ is shown in Fig. 2(b), and the atoms in $\mathbf{A}_t$
corresponding to the nonzero entries of $\boldsymbol{\beta}$ are shown in Fig. 2(c).

Fig. 2. Example of sparse representation of a target test sample. (a) The original
pixel $\mathbf{x}$ (blue solid) and reconstructed pixel $\mathbf{A}_t\boldsymbol{\beta}$ represented by four training
samples in $\mathbf{A}_t$ (red dashed). The MSE between $\mathbf{x}$ and $\mathbf{A}_t\boldsymbol{\beta}$ is $1.70 \times 10^{-6}$.
(b) The sparse representation $\boldsymbol{\beta}$ of $\mathbf{x}$. (c) The four target training spectral signatures
corresponding to the nonzero entries of $\boldsymbol{\beta}$.

In our proposed detection algorithm, an unknown test sample
is modeled to lie in the union of the background and target subspaces.
Therefore, by combining the two dictionaries $\mathbf{A}_b$ and
$\mathbf{A}_t$, a test sample $\mathbf{x}$ can be written as a sparse linear combination
of all training pixels

$$\mathbf{x} = \mathbf{A}_b \boldsymbol{\alpha} + \mathbf{A}_t \boldsymbol{\beta} = \underbrace{[\mathbf{A}_b\ \mathbf{A}_t]}_{\mathbf{A}} \underbrace{\begin{bmatrix} \boldsymbol{\alpha} \\ \boldsymbol{\beta} \end{bmatrix}}_{\boldsymbol{\gamma}} = \mathbf{A}\boldsymbol{\gamma} \tag{10}$$

where $\mathbf{A} = [\mathbf{A}_b\ \mathbf{A}_t]$ is a $B \times (N_b + N_t)$ matrix consisting
of both background and target training samples, and
$\boldsymbol{\gamma} = [\boldsymbol{\alpha}^T\ \boldsymbol{\beta}^T]^T$ is an $(N_b + N_t)$-dimensional vector consisting
of the two vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ associated with the two
dictionaries. This model is similar to that of the MSD in (4),
where the test sample is assumed to lie in a subspace spanned
by training samples from both background and target classes.
However, in the case of MSD, the target and background are
assumed to have a Gaussian distribution and GLRT is used to
develop the detector. In our sparsity-based model, no assumption
about the target and background distributions is required.
Also, in the MSD signal model, the columns of the background
and target dictionaries have to be independent in order to
generate the required projection operators. In our approach,
the subspace model is more general since independence
between the training samples is not necessary. The vector
$\boldsymbol{\gamma}$ is a concatenation of the two vectors associated with the
background and target dictionaries, and it is also sparse
for the following reason. Since the background (e.g., trees, grass, road, soil)
and target (e.g., metal, paint, glass) pixels usually consist of
different materials, they have distinct spectral signatures, and
thus the spectra of target and background pixels lie in different
subspaces. For example, if $\mathbf{x}$ is a target pixel, then ideally
it cannot be represented by the background training samples.
In this case, $\boldsymbol{\alpha}$ is a zero vector and $\boldsymbol{\beta}$ is a sparse vector. On
the other hand, if $\mathbf{x}$ belongs to the background class, then $\boldsymbol{\alpha}$ is
sparse and $\boldsymbol{\beta}$ is a zero vector. Therefore, the test sample $\mathbf{x}$ can
be sparsely represented by the combined background and target
dictionaries, and the locations of nonzero entries in the sparse
vector $\boldsymbol{\gamma}$ contain critical information about the class
of the test sample $\mathbf{x}$. Next, we demonstrate how to obtain $\boldsymbol{\gamma}$ and
how to label the class of a test sample from $\boldsymbol{\gamma}$.

B.  Reconstruction  and  Detection 

This section considers the reconstruction problem of finding
the sparse vector $\boldsymbol{\gamma}$ for a test sample $\mathbf{x}$, given the dictionary $\mathbf{A}$.
As discussed above, a test sample can be approximately represented
by very few training samples. Given the dictionary of
training samples $\mathbf{A} = [\mathbf{A}_b\ \mathbf{A}_t]$, the representation $\boldsymbol{\gamma}$ satisfying
$\mathbf{A}\boldsymbol{\gamma} = \mathbf{x}$ can be obtained by solving the following optimization
problem for the sparsest vector:

$$\hat{\boldsymbol{\gamma}} = \arg\min \|\boldsymbol{\gamma}\|_0 \quad \text{subject to} \quad \mathbf{A}\boldsymbol{\gamma} = \mathbf{x} \tag{11}$$

where $\|\cdot\|_0$ denotes the $\ell_0$-norm, which is defined as the number
of nonzero entries in the vector (also called the sparsity level
of the vector). The above problem of minimizing the $\ell_0$-norm
is an NP-hard problem. If the solution is sufficiently sparse,
this NP-hard problem can be relaxed to a linear programming
problem by replacing the $\ell_0$-norm with the $\ell_1$-norm, which can then
be solved efficiently by convex programming techniques [28],
[29]. Alternatively, the problem in (11) can also be approximately
solved by greedy pursuit algorithms such as orthogonal
matching pursuit (OMP) [30] or subspace pursuit (SP) [31].
Due to the presence of approximation errors in empirical data,
the equality constraint in (11) can be relaxed to an inequality
one

$$\hat{\boldsymbol{\gamma}} = \arg\min \|\boldsymbol{\gamma}\|_0 \quad \text{subject to} \quad \|\mathbf{A}\boldsymbol{\gamma} - \mathbf{x}\|_2 \le \sigma \tag{12}$$

where $\sigma$ is the error tolerance. The above problem can also be
interpreted as minimizing the approximation error within a certain
sparsity level

$$\hat{\boldsymbol{\gamma}} = \arg\min \|\mathbf{A}\boldsymbol{\gamma} - \mathbf{x}\|_2 \quad \text{subject to} \quad \|\boldsymbol{\gamma}\|_0 \le K_0 \tag{13}$$

where $K_0$ is a given upper bound on the sparsity level [32]. In
[33], it has been shown that the solutions to the problems in (12)
and (13) coincide for properly chosen parameters $\sigma$ and $K_0$, and
therefore the two problems are in some sense equivalent. In this
paper, the greedy SP algorithm [31] is used to approximately
solve the sparse recovery problem (13) due to its computational
efficiency.

The sparse vector $\hat{\boldsymbol{\gamma}}$ is recovered by decomposing the pixel
$\mathbf{x}$ over the given dictionary $\mathbf{A}$ to find the few atoms in $\mathbf{A}$ that
best represent the test pixel $\mathbf{x}$. The recovery process implicitly
leads to a competition between the two subspaces. Therefore,
the recovered sparse representation is naturally discriminative.
Once the sparse vector $\hat{\boldsymbol{\gamma}}$ is obtained, the class of $\mathbf{x}$ can
be determined by comparing the residuals $r_b(\mathbf{x}) = \|\mathbf{x} - \mathbf{A}_b\hat{\boldsymbol{\alpha}}\|_2$
and $r_t(\mathbf{x}) = \|\mathbf{x} - \mathbf{A}_t\hat{\boldsymbol{\beta}}\|_2$, where $\hat{\boldsymbol{\alpha}}$ and $\hat{\boldsymbol{\beta}}$ represent the recovered
sparse coefficients corresponding to the background and
target dictionaries, respectively. In our approach, the output of
the detector is calculated by

$$D(\mathbf{x}) = r_b(\mathbf{x}) - r_t(\mathbf{x}). \tag{14}$$

If $D(\mathbf{x}) > \delta$, with $\delta$ being a prescribed threshold, then $\mathbf{x}$ is determined
to be a target pixel; otherwise, $\mathbf{x}$ is labeled as background.
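The recovery-then-residual procedure can be sketched as follows. Note that this example uses orthogonal matching pursuit (OMP), one of the greedy solvers mentioned above, rather than the SP algorithm used in the paper, simply because OMP is shorter to write down. The synthetic dictionaries are built from two mutually orthogonal subspaces, an idealized assumption chosen so the behavior is easy to follow; all function and variable names are hypothetical.

```python
import numpy as np

def omp(A, x, K0, tol=1e-12):
    """Greedy approximation of the K0-sparse least-squares problem (OMP, not SP)."""
    residual = x.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(K0):
        corr = np.abs(A.T @ residual)
        if support:
            corr[support] = 0.0            # never reselect an atom
        j = int(np.argmax(corr))
        if corr[j] <= tol:                 # residual already explained
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], x, rcond=None)
        residual = x - A[:, support] @ coef
    gamma = np.zeros(A.shape[1])
    if support:
        gamma[support] = coef
    return gamma

def detector_output(x, A_b, A_t, K0):
    """D(x) = r_b(x) - r_t(x) from the partial reconstructions."""
    gamma = omp(np.hstack([A_b, A_t]), x, K0)
    alpha, beta = gamma[:A_b.shape[1]], gamma[A_b.shape[1]:]
    r_b = np.linalg.norm(x - A_b @ alpha)
    r_t = np.linalg.norm(x - A_t @ beta)
    return r_b - r_t

rng = np.random.default_rng(0)
n_bands = 30
Q, _ = np.linalg.qr(rng.normal(size=(n_bands, 5)))
bg_basis, tgt_basis = Q[:, :3], Q[:, 3:]   # orthogonal by construction (idealized)

def random_atoms(basis, n):
    atoms = basis @ rng.normal(size=(basis.shape[1], n))
    return atoms / np.linalg.norm(atoms, axis=0)

A_b, A_t = random_atoms(bg_basis, 40), random_atoms(tgt_basis, 8)
x_bg = 0.7 * A_b[:, 0] + 0.3 * A_b[:, 5]
x_tgt = 0.6 * A_t[:, 1] + 0.4 * A_t[:, 4]
D_bg = detector_output(x_bg, A_b, A_t, K0=3)
D_tgt = detector_output(x_tgt, A_b, A_t, K0=3)
```

With these idealized dictionaries the background pixel yields a negative output and the target pixel a positive one, so any threshold between the two values separates them.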

Fig. 3 shows an example of sparse reconstruction of a background
test sample and a comparison to the pseudo-inverse reconstruction.
This example illustrates the advantage of the $\ell_0$-norm
in classification problems over the conventional $\ell_2$-norm. The
pseudo-inverse solution is obtained by solving the following
minimum $\ell_2$-norm problem:

$$\hat{\boldsymbol{\gamma}}_2 = \arg\min \|\boldsymbol{\gamma}\|_2 \quad \text{subject to} \quad \mathbf{A}\boldsymbol{\gamma} = \mathbf{x}. \tag{15}$$

The above problem in (15), for the underdetermined linear
system $\mathbf{A}\boldsymbol{\gamma} = \mathbf{x}$, has a closed-form solution $\hat{\boldsymbol{\gamma}}_2 = \mathbf{A}^{\dagger}\mathbf{x}$, with $\mathbf{A}^{\dagger}$
being the pseudo-inverse of $\mathbf{A}$. For a test sample $\mathbf{x}$ and training
dictionary $\mathbf{A}$, the minimum $\ell_0$-norm vector $\hat{\boldsymbol{\gamma}}$ and minimum
$\ell_2$-norm vector $\hat{\boldsymbol{\gamma}}_2$ are shown in Figs. 3(a) and (b), respectively.
Blue and red represent entries corresponding to the background
and target dictionaries, respectively. The original test sample $\mathbf{x}$
and the partial reconstructions using only the background
dictionary ($\hat{\mathbf{x}}_b = \mathbf{A}_b\hat{\boldsymbol{\alpha}}$, $\hat{\mathbf{x}}_{b,2} = \mathbf{A}_b\hat{\boldsymbol{\alpha}}_2$) and only the target dictionary
($\hat{\mathbf{x}}_t = \mathbf{A}_t\hat{\boldsymbol{\beta}}$, $\hat{\mathbf{x}}_{t,2} = \mathbf{A}_t\hat{\boldsymbol{\beta}}_2$) are shown in Figs. 3(c) and (d).
Although the pseudo-inverse solution yields perfect reconstruction,
we see that it is not sparse and its nonzero entries
spread over both classes. Thus, $\hat{\boldsymbol{\gamma}}_2$ cannot be used directly for
detection. The minimum $\ell_0$-norm solution $\hat{\boldsymbol{\gamma}}$, on the contrary,
has all of its nonzero entries concentrated in the background
part, which indicates that the test sample lies in the background
subspace. Furthermore, with the pseudo-inverse solution $\hat{\boldsymbol{\gamma}}_2$, as
seen in Fig. 3(d), neither $\mathbf{A}_b\hat{\boldsymbol{\alpha}}_2$ nor $\mathbf{A}_t\hat{\boldsymbol{\beta}}_2$ accurately approximates
the original pixel, leading to a small difference between
the residuals $r_{b,2}(\mathbf{x}) = 0.568$ and $r_{t,2}(\mathbf{x}) = 0.516$. Hence,
the solution $\hat{\boldsymbol{\gamma}}_2$ cannot be used to determine the class of the
input solely based on the residuals. On the other hand, the
residuals associated with the minimum $\ell_0$-norm solution $\hat{\boldsymbol{\gamma}}$ are
$r_b(\mathbf{x}) = 0.054 \ll r_t(\mathbf{x}) = 1$ (i.e., the original pixel $\mathbf{x}$ is well
approximated by the background dictionary). Clearly, $\mathbf{x}$ is classified as a
background pixel using the minimum $\ell_0$-norm solution.
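The contrast between the two solutions can be reproduced on synthetic data. In the sketch below, a pixel is generated from two atoms of a random dictionary, and the minimum $\ell_2$-norm solution is computed with the pseudo-inverse. The dictionary, its size, and the chosen atoms are illustrative assumptions, not the data behind Fig. 3.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, n_atoms = 20, 60                    # underdetermined: more atoms than bands
A = rng.normal(size=(n_bands, n_atoms))
A /= np.linalg.norm(A, axis=0)               # unit-norm atoms

# A pixel that is an exact combination of two atoms.
gamma_sparse = np.zeros(n_atoms)
gamma_sparse[[3, 17]] = [1.0, -0.5]
x = A @ gamma_sparse

# Minimum l2-norm solution of A @ gamma = x via the pseudo-inverse.
gamma_2 = np.linalg.pinv(A) @ x
n_nonzero = int(np.count_nonzero(np.abs(gamma_2) > 1e-8))

reconstruction_error = np.linalg.norm(A @ gamma_2 - x)
norm_l2_solution = np.linalg.norm(gamma_2)
norm_sparse_solution = np.linalg.norm(gamma_sparse)
```

The pseudo-inverse solution reproduces the pixel and has the smaller Euclidean norm, yet it spreads its energy over many atoms, so the positions of its nonzero entries say little about which atoms generated the pixel.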


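Fig. 3. Example of sparse reconstruction of a background test sample with a
comparison to the minimum $\ell_2$-norm (pseudo-inverse) solution. (a) Minimum
$\ell_0$-norm solution $\hat{\boldsymbol{\gamma}}$. (b) Pseudo-inverse solution $\hat{\boldsymbol{\gamma}}_2 = \mathbf{A}^{\dagger}\mathbf{x}$. (c) Minimum
$\ell_0$-norm reconstruction from the background dictionary $\hat{\mathbf{x}}_b = \mathbf{A}_b\hat{\boldsymbol{\alpha}}$ (blue
dashed), reconstruction from the target dictionary $\hat{\mathbf{x}}_t = \mathbf{A}_t\hat{\boldsymbol{\beta}}$ (red dashed), and
the original test sample $\mathbf{x}$ (black solid). (d) Pseudo-inverse reconstruction from
the background dictionary $\hat{\mathbf{x}}_{b,2} = \mathbf{A}_b\hat{\boldsymbol{\alpha}}_2$ (blue dashed), reconstruction from
the target dictionary $\hat{\mathbf{x}}_{t,2} = \mathbf{A}_t\hat{\boldsymbol{\beta}}_2$ (red dashed), and the original test sample $\mathbf{x}$
(black solid).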


C.  Background  and  Target  Dictionary  Construction 

Another  aspect  of  the  problem  that  requires  careful  attention 
is  how  to  construct  appropriate  dictionaries  Ab  and  At.  Global 
dictionaries  for  target  and  background  can  be  designed  using 
given  training  data.  However,  in  target  detection  applications 
there  is  usually  a  lack  of  training  data  especially  for  the  target. 
The  background  is  often  modeled  by  a  subspace  by  using  some 
random  pixels  from  the  test  image.  Furthermore,  a  single  target 
spectral  signature,  as  employed  in  SMF,  is  often  insufficient  to 
represent a target material as the spectrum is affected by environmental
conditions (e.g., illumination and atmospheric variations).
By using physical models and the MODTRAN atmospheric-modeling
program [34], meaningful target spectral signatures
can be generated that capture the target signature
appearance over a wide range of atmospheric conditions. For example,
in [21] a target subspace was constructed by generating a
large number of target signatures using MODTRAN under various
atmospheric conditions. A similar idea can be incorporated
in  our  approach  to  construct  a  redundant  target  dictionary  which 
could  be  invariant  to  the  environmental  variations.  Furthermore, 
it  can  be  combined  with  the  idea  of  frame  generation  [35],  [36] 
by  imposing  the  constraints  on  tightness,  maximum  robustness, 
equiangularity,  etc.,  to  design  more  desirable  overcomplete  dic¬ 
tionaries.  The  K-SVD  dictionary  design  technique  [37],  which 
alternately  minimizes  sparsity  of  the  representation  and  updates 
the  codebook  to  better  fit  the  data,  can  also  be  used  to  form  the 
redundant  dictionaries  to  further  improve  the  performance  of  the 
proposed  sparsity-based  algorithm. 

In this paper, we use a small global target dictionary constructed
by using some of the target pixels on one of the targets
in the scene. For the background dictionary, instead of using
a fixed global background dictionary containing samples from
various background materials (e.g., trees, grass, road, and buildings),
we use an adaptive local background dictionary in order to
better represent and capture the spectral signature of the test sample.
Specifically, the background dictionary $\mathbf{A}_b$ is generated locally
for each test pixel using a dual window centered at the pixel
of interest, as shown in Fig. 4. The inner window should be
larger than the size of a target. Only pixels in the outer region
form the atoms in $\mathbf{A}_b$. In this way, the subspace spanned by
the background dictionary becomes adaptive to the local statistics.
Therefore, if the test sample is a background pixel, it is
highly likely to find very similar spectral characteristics in
the background dictionary. On the other hand, if the test sample
is a target pixel, it would be difficult for the pixel to find a good
match in $\mathbf{A}_b$ since the outer window region does not include any
target pixels. The use of a dual window significantly improves
the detection performance over a global background dictionary,
as is shown via the simulation results in Section IV.

Fig. 4. Dual window centered at test sample $\mathbf{x}$.
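The dual-window construction can be sketched as a simple loop over the outer window that skips the inner window. The function name, the square window shapes, and the clipping of the window at image borders are assumptions for this example; the text does not specify border handling.

```python
import numpy as np

def local_background_dictionary(cube, row, col, inner, outer):
    """Collect outer-region pixels around (row, col) as dictionary atoms.

    cube: (H, W, B) hyperspectral image; inner, outer: odd window sizes with
    inner < outer. Returns a B x N_b matrix whose columns are the pixels lying
    inside the outer window but outside the inner window. Windows are clipped
    at the image border (an assumption of this sketch).
    """
    H, W, _ = cube.shape
    r_out, r_in = outer // 2, inner // 2
    atoms = []
    for r in range(max(0, row - r_out), min(H, row + r_out + 1)):
        for c in range(max(0, col - r_out), min(W, col + r_out + 1)):
            if abs(r - row) > r_in or abs(c - col) > r_in:
                atoms.append(cube[r, c])
    return np.stack(atoms, axis=1)

# Synthetic cube with distinct pixel values: 20 x 20 pixels, 5 bands.
cube = np.arange(20 * 20 * 5, dtype=float).reshape(20, 20, 5)
A_b = local_background_dictionary(cube, 10, 10, inner=5, outer=9)
A_corner = local_background_dictionary(cube, 0, 0, inner=5, outer=9)
```

For an interior pixel with a 9 x 9 outer window and a 5 x 5 inner window, the dictionary has 81 - 25 = 56 atoms; near the border the clipped window yields fewer.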


D.  Detection  With  Smoothing  Constraint 

In  the  above  process,  the  sparsity-based  target  detector  is 
applied  to  each  pixel  in  the  test  region  independently  without 
considering  the  correlation  between  neighboring  pixels.  Hy¬ 
perspectral  imagery,  however,  is  usually  smooth  in  the  sense 
that  neighboring  pixels  usually  consist  of  similar  materials  and 
have  similar  spectral  characteristics  where  small  differences  are 
often  due  to  sensor  noise  and/or  atmospheric  variation.  In  this 
paper,  we  assume  that  there  are  multiple  pixels  on  the  target. 
Therefore,  we  propose  to  incorporate  a  smoothing  penalty  term 
in  the  proposed  sparsity-based  detector  in  order  to  exploit  the 
spatial  correlation  between  neighboring  pixels. 

Let $\mathbf{x}_1$ be a pixel of interest in a hyperspectral image $I$, and
$\mathbf{x}_i$, $i = 2, \ldots, 5$, be its four nearest neighbors in the spatial domain,
as shown in Fig. 5. While searching for the sparsest representation
of the test sample $\mathbf{x}_1$, we simultaneously minimize
the vector Laplacian at the reconstructed pixel $\hat{\mathbf{x}}_1$, which is a
$B$-dimensional vector calculated as

$$\nabla^2(\hat{\mathbf{x}}_1) = 4\hat{\mathbf{x}}_1 - \hat{\mathbf{x}}_2 - \hat{\mathbf{x}}_3 - \hat{\mathbf{x}}_4 - \hat{\mathbf{x}}_5 = \mathbf{A}(4\boldsymbol{\gamma}_1 - \boldsymbol{\gamma}_2 - \boldsymbol{\gamma}_3 - \boldsymbol{\gamma}_4 - \boldsymbol{\gamma}_5) \tag{16}$$

where $\hat{\mathbf{x}}_i = \mathbf{A}\boldsymbol{\gamma}_i$ is the reconstruction of $\mathbf{x}_i$ and $\boldsymbol{\gamma}_i$ is the corresponding
recovered sparse vector. In this way, the reconstructed
test sample is forced to have spectral characteristics similar to those of
its four nearest neighbors; hence, smoothness is enforced across
the spectral pixels in the reconstructed image.


Fig. 5. Four nearest neighbors of a pixel $\mathbf{x}_1$.


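Let $\boldsymbol{\gamma}_i$ be the sparse vector associated with $\mathbf{x}_i$ (i.e., $\mathbf{x}_i = \mathbf{A}\boldsymbol{\gamma}_i$).
The new problem with the smoothing constraint can now be
formulated as

$$\begin{aligned} \text{minimize} \quad & \sum_{i=1}^{5} \|\boldsymbol{\gamma}_i\|_0 \\ \text{subject to:} \quad & \mathbf{A}(4\boldsymbol{\gamma}_1 - \boldsymbol{\gamma}_2 - \boldsymbol{\gamma}_3 - \boldsymbol{\gamma}_4 - \boldsymbol{\gamma}_5) = \mathbf{0} \\ & \mathbf{x}_i = \mathbf{A}\boldsymbol{\gamma}_i, \quad i = 1, \ldots, 5. \end{aligned} \tag{17}$$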

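In (17), we aim to find the sparsest vector that approximately satisfies
two sets of linear constraints. The first set forces the vector
Laplacian of the reconstructed pixel $\hat{\mathbf{x}}_1$ to be minimal such that
the reconstructed neighboring pixels have similar spectral characteristics,
and the second set minimizes reconstruction errors.
Now denote the concatenation of the $\boldsymbol{\gamma}_i$'s and $\mathbf{x}_i$'s by

$$\boldsymbol{\gamma} = \begin{bmatrix} \boldsymbol{\gamma}_1 \\ \vdots \\ \boldsymbol{\gamma}_5 \end{bmatrix} \quad \text{and} \quad \mathbf{x} = \begin{bmatrix} \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_5 \end{bmatrix}. \tag{18}$$

The linear constraints can be written in terms of $\mathbf{x}$ and $\boldsymbol{\gamma}$ as

$$[\,4\mathbf{A}\ \ {-\mathbf{A}}\ \ {-\mathbf{A}}\ \ {-\mathbf{A}}\ \ {-\mathbf{A}}\,]\,\boldsymbol{\gamma} = \mathbf{0}, \qquad \begin{bmatrix} \mathbf{A} & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{A} \end{bmatrix} \boldsymbol{\gamma} = \mathbf{x}. \tag{19}$$

Therefore, the optimization problem in (17) can be reformulated
as

$$\text{minimize} \quad \|\boldsymbol{\gamma}\|_0 \quad \text{subject to:} \quad \tilde{\mathbf{A}}\boldsymbol{\gamma} = \tilde{\mathbf{x}}, \tag{20}$$

where

$$\tilde{\mathbf{A}} = \begin{bmatrix} 4\mathbf{A} & -\mathbf{A} & -\mathbf{A} & -\mathbf{A} & -\mathbf{A} \\ \mathbf{A} & & & & \mathbf{0} \\ & & \ddots & & \\ \mathbf{0} & & & & \mathbf{A} \end{bmatrix} \quad \text{and} \quad \tilde{\mathbf{x}} = \begin{bmatrix} \mathbf{0} \\ \mathbf{x} \end{bmatrix}.$$

The problem in (20) is the standard form of a linearly constrained
sparsity-minimization problem and can be solved using
the greedy SP algorithm [31]. Similar to the previous case in
(11), this problem can also be relaxed to allow for approximation
errors in empirical data and be rewritten as

$$\hat{\boldsymbol{\gamma}} = \arg\min \|\boldsymbol{\gamma}\|_0 \quad \text{subject to} \quad \|\tilde{\mathbf{A}}\boldsymbol{\gamma} - \tilde{\mathbf{x}}\|_2 \le \sigma \tag{21}$$

or

$$\hat{\boldsymbol{\gamma}} = \arg\min \|\tilde{\mathbf{A}}\boldsymbol{\gamma} - \tilde{\mathbf{x}}\|_2 \quad \text{subject to} \quad \|\boldsymbol{\gamma}\|_0 \le K_0 \tag{22}$$

where $\sigma$ is the error tolerance and $K_0$ is the sparsity level.

Fig. 6. Example comparing the reconstruction and detection problem for a
background test sample without and with the smoothing constraint. (a) Solution
to (11) (without smoothing constraint). (b) By solving (11), the reconstruction
from the background dictionary (blue dashed), reconstruction from the target
dictionary (red dashed), and the original test sample (black solid). (c) Solution
to (20) (with smoothing constraint) for the centered test sample. (d) By solving
(20), the reconstruction from the background dictionary (blue dashed), reconstruction
from the target dictionary (red dashed), and the original test sample
(black solid).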
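The stacked linear system that combines the Laplacian constraint with the five reconstruction constraints can be assembled as in the sketch below. The block layout (Laplacian row on top, block-diagonal reconstruction rows underneath) follows the constraints as written; the function name and the random test data are assumptions for illustration.

```python
import numpy as np

def stacked_system(A, pixels):
    """Build the stacked matrix and right-hand side for five neighboring pixels.

    A: B x N dictionary; pixels: [x1, x2, x3, x4, x5] with x1 the center pixel.
    The top block row encodes A(4*g1 - g2 - g3 - g4 - g5) = 0 and the remaining
    rows encode x_i = A g_i for i = 1..5.
    """
    laplacian_rows = np.hstack([4 * A, -A, -A, -A, -A])      # B x 5N
    reconstruction_rows = np.kron(np.eye(5), A)              # block diagonal, 5B x 5N
    A_tilde = np.vstack([laplacian_rows, reconstruction_rows])
    x_tilde = np.concatenate([np.zeros(A.shape[0]), *pixels])
    return A_tilde, x_tilde

rng = np.random.default_rng(4)
n_bands, n_atoms = 6, 4
A = rng.normal(size=(n_bands, n_atoms))
gammas = [rng.normal(size=n_atoms) for _ in range(5)]
pixels = [A @ g for g in gammas]

A_tilde, x_tilde = stacked_system(A, pixels)
gamma = np.concatenate(gammas)
product = A_tilde @ gamma
laplacian = 4 * pixels[0] - sum(pixels[1:])
```

The first B entries of the product give the vector Laplacian of the reconstructed center pixel, and the remaining entries give the five reconstructions, so any sparse solver that accepts a dictionary and a target vector can be applied to the stacked pair.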

By exploiting the smoothness across the HSI pixels, the
detection performance can be significantly improved. Fig. 6
shows an example of a background test sample which is
misclassified as a target using (11), but is correctly labeled
using (20) with the smoothing constraint. The solution to (11)
for the given test sample $\mathbf{x}$ is depicted in Fig. 6(a). We see
that the nonzero entries of the solution correspond to both
background and target training atoms, and the residuals are
$r_b = 0.5381$, $r_t = 0.5088$. In the case with the smoothing
constraint, by solving (20), the nonzero entries concentrate only
on the part corresponding to the background dictionary, and the
residuals are $r_b = 0.0538$, $r_t = 1$. Clearly, $r_b \ll r_t$ and the test
sample will thus be correctly labeled as a background sample.

Once  the  sparse  vector  in  (20)  is  obtained,  detection  can  be 
performed  based  on  the  characteristics  of  the  sparse  coefficients 
as  it  was  done  in  Section  III-B.  We  calculate  the  total  residuals 
obtained  separately  from  the  target  and  background  dictionaries 
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Fig.  7.  Results  for  Desert  Radiance  II  (DR- II)  from  (20)  with  the  smoothing 
constraint,  (a)  Averaged  image  over  150  bands,  (b)  Sparsity-based  target  de¬ 
tector  output:  difference  between  rh  and  rt.  (c)  Residual  rb  corresponding  to 
the  local  background  dictionary  using  the  dual-window  approach,  (d)  Residual 
rt  corresponding  to  the  target  dictionary. 

and

where $\hat{\alpha}_i$ and $\hat{\beta}_i$ denote the recovered sparse
coefficients for $x_i$ associated with the background and target
dictionaries, respectively. The output of the proposed sparsity-based
detector for the center pixel $x_1$ is computed by the difference of
residuals and the detection decision is made in a similar fashion as in
the other algorithms introduced in Section II:

$$D(x_1) = r_b(x_1) - r_t(x_1) \underset{H_0}{\overset{H_1}{\gtrless}} \delta. \qquad (24)$$

That is, if the output $D(x_1)$ is greater than a prescribed threshold
$\delta$, then the test sample $x_1$ is labeled as a target; otherwise it is
labeled as background.
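
The decision rule can be illustrated with a minimal sketch. It assumes the sparse vectors have already been recovered, and it uses the single-pixel residuals $r_b(x) = \|x - A_b\alpha\|_2$ and $r_t(x) = \|x - A_t\beta\|_2$ rather than the total residuals of the smoothed formulation. The function names and the default threshold of zero are illustrative assumptions, not part of the proposed algorithm.

```python
import numpy as np

def residuals(x, A_b, A_t, alpha, beta):
    """Return (r_b, r_t) for one test pixel x.

    A_b, A_t: background and target dictionaries (bands x atoms).
    alpha, beta: recovered sparse coefficients (assumed given).
    """
    r_b = np.linalg.norm(x - A_b @ alpha)  # residual w.r.t. background atoms
    r_t = np.linalg.norm(x - A_t @ beta)   # residual w.r.t. target atoms
    return r_b, r_t

def decide(r_b, r_t, delta=0.0):
    """Label a pixel from D = r_b - r_t compared with threshold delta."""
    return "target" if (r_b - r_t) > delta else "background"

# Residuals quoted in the text for the smoothed solution of the example pixel.
print(decide(0.0538, 1.0))
```

With the residuals reported above for the smoothed case, the difference is negative, so any nonnegative threshold labels the pixel as background.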


IV.  Simulation  Results  and  Analysis 

The proposed target detection algorithm, as well as the SMF,
MSD, ASD, and SVM, are applied to several real hyperspectral images
(HSI), and the results are compared both visually and quantitatively
by the receiver operating characteristic (ROC) curves. The ROC curve
describes the probability of detection (PD) as a function of the
probability of false alarms (PFA). To be more specific, we pick
thousands of different thresholds between the minimal and maximal
values of the detector output. The class labels for all pixels
in the test region are determined at each threshold. The PFA
is calculated as the number of false alarms (background pixels
determined as target) divided by the total number of pixels in the
test region, and the PD is the ratio of the number of hits (target
pixels determined as target) to the total number of true target pixels.
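
The threshold sweep described above can be sketched as follows. This is a minimal illustration that follows the PFA and PD definitions given in the text (PFA normalized by the total number of pixels in the test region). The function name, the evenly spaced thresholds, and the boolean ground-truth mask are assumptions made for the example.

```python
import numpy as np

def roc_curve(scores, is_target, n_thresholds=1000):
    """Sweep thresholds over the detector output and return (thresholds, PFA, PD).

    scores: detector output for every pixel in the test region.
    is_target: boolean ground truth of the same shape (must contain targets).
    """
    scores = np.asarray(scores, dtype=float).ravel()
    is_target = np.asarray(is_target, dtype=bool).ravel()
    thresholds = np.linspace(scores.min(), scores.max(), n_thresholds)
    pfa = np.empty(n_thresholds)
    pd = np.empty(n_thresholds)
    n_targets = np.count_nonzero(is_target)
    for k, delta in enumerate(thresholds):
        detected = scores > delta
        false_alarms = np.count_nonzero(detected & ~is_target)
        hits = np.count_nonzero(detected & is_target)
        pfa[k] = false_alarms / scores.size  # over all pixels, as in the text
        pd[k] = hits / n_targets
    return thresholds, pfa, pd
```

Plotting `pd` against `pfa` gives the ROC curve; a finer threshold grid only adds points along the same curve.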

Two of the images, the Desert Radiance II data collection
(DR-II) and Forest Radiance I data collection (FR-I), are from a
hyperspectral digital imagery collection experiment (HYDICE)
sensor [38]. The HYDICE sensor generates 210 bands across
the whole spectral range from 0.4 to 2.5 μm, which includes the
visible and short-wave infrared bands. We use 150 of the 210
bands (23rd-101st, 109th-136th, and 152nd-194th), removing
the absorption and low-SNR bands. The DR-II image contains
six military targets on the dirt road and the FR-I image contains
14 targets along the tree line, as depicted in Figs. 7(a) and 9(a),
respectively. For these two HYDICE images, every pixel on the
targets is considered a target pixel. The third image, collected
from the Airborne Hyperspectral Imager (AHI) [39] operating
in the long-wave infrared spectrum ranging from 8 to 11.5 μm,
contains surface and buried mines as shown in Fig. 11(a), in
which every pixel has 70 spectral bands. In this image, there
are about 230 mines, each roughly of size 5 × 5 pixels, and each
mine is treated as a target when computing the PD.

Fig. 8. Output for DR-II using local background dictionary (dual-window
approach), with (a) sparsity-based target detector without smoothing
constraint using (11), (b) SVM with composite kernel, (c) MSD, (d) SMF,
and (e) ASD. (f) We repeat here the result of our proposed sparsity-based
target detector with smoothing constraint for visual comparison.

Fig. 9. Results for Forest Radiance I (FR-I) from (20) with smoothing
constraint. (a) Averaged image over 150 bands. (b) Sparsity-based target
detector output: difference between $r_b$ and $r_t$. (c) Residual $r_b$
corresponding to the background dictionary (dual-window approach).
(d) Residual $r_t$ corresponding to the target dictionary.

For DR-II and FR-I, the spectral signatures of the target
$\{a_i^t\}_{i=1,\ldots,N_t}$ are collected directly from $N_t = 18$ pixels
from the leftmost target in the given hyperspectral data. The
background signatures $\{a_i^b\}_{i=1,\ldots,N_b}$ are generated by the
pixels in the outer region of a dual window as discussed in
Section III. The sizes of the outer and inner windows are 21 × 21
and 15 × 15, respectively, and there are $N_b = 216$ background
training samples. The subspace pursuit algorithm [31] is used
to solve the sparsity-constrained problems (11) and (20). The
results of the proposed detector with the smoothing constraint
for DR-II are shown in Fig. 7(b)-(d). Fig. 7(c) and (d) show
the residual corresponding to the background dictionary,
$r_b(x) = \|x - A_b\alpha\|_2$, and the residual corresponding to the
target dictionary, $r_t(x) = \|x - A_t\beta\|_2$, respectively, whereas
Fig. 7(b) shows the difference between $r_b$ and $r_t$. In Fig. 7(c),
the background pixels are dark while the target pixels are bright,
because for each target pixel the sparsity-constrained
optimizer could not find good matches in the background
dictionary; therefore, the sparse vector $\alpha \approx 0$ and the residual
associated with the background dictionary is $r_b(x) \approx \|x\|_2$. In
contrast, in Fig. 7(d), the targets are dark while the background
is bright. Finally, as shown in Fig. 7(b), the difference
between $r_b$ and $r_t$ further suppresses the background and
emphasizes the targets, thus yielding better detection performance.
Similar results can be seen in Fig. 9(b)-(d) for the FR-I
image. In Fig. 9(c), which represents the residual image $r_b$,
although the targets are bright, we can also see that the shadow of
trees near the upper and right borders of the image has higher
magnitude than the other background areas. In Fig. 9(b), the
shadow is suppressed, which improves the false alarm rate.

Fig. 10. Output for FR-I using local background dictionary (dual-window
approach), with (a) sparsity-based target detector without smoothing
constraint using (11), (b) SVM with composite kernel, (c) MSD, (d) SMF,
and (e) ASD. (f) We repeat here the result of our proposed sparsity-based
target detector with smoothing constraint for visual comparison.

Fig. 11. Results for the mine image from (20) with smoothing constraint.
(a) Averaged image over 70 bands. (b) Detector output: difference between
$r_b$ and $r_t$. (c) Residual $r_b$ corresponding to the background dictionary
(dual-window approach). (d) Residual $r_t$ corresponding to the target
dictionary.

Fig. 12. Output for the mine image using local background dictionary
(dual-window approach), with (a) sparsity-based target detector without
smoothing constraint, (b) SVM with composite kernel, (c) MSD, (d) SMF,
and (e) ASD. (f) We repeat here the result of our proposed sparsity-based
target detector with smoothing constraint for visual comparison.

Similar results can be observed in Fig. 11 for the mine
image, where the target dictionary $A_t$ is generated from
$N_t = 50$ training samples of two mines, each occupying a
5 × 5 area, outside the test region. Since the targets in this
image are smaller than those in the two HYDICE images, the
inner window size is chosen to be 9 × 9 and the outer window
size remains 21 × 21. The background dictionary $A_b$ then
consists of $N_b = 360$ samples.
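
The background dictionary sizes quoted above follow from the window geometry: $21^2 - 15^2 = 216$ and $21^2 - 9^2 = 360$. The sketch below collects the outer-region spectra for one test pixel. The function name, the (rows, columns, bands) cube layout, and the choice to skip positions that fall outside the image are assumptions made for illustration; the text does not specify the border handling.

```python
import numpy as np

def dual_window_background(cube, row, col, outer=21, inner=15):
    """Return a bands x N_b matrix of spectra inside the outer window
    but outside the inner window centered at (row, col)."""
    half_outer, half_inner = outer // 2, inner // 2
    n_rows, n_cols, _ = cube.shape
    atoms = []
    for r in range(row - half_outer, row + half_outer + 1):
        for c in range(col - half_outer, col + half_outer + 1):
            inside_inner = abs(r - row) <= half_inner and abs(c - col) <= half_inner
            inside_image = 0 <= r < n_rows and 0 <= c < n_cols
            if inside_image and not inside_inner:
                atoms.append(cube[r, c])
    return np.stack(atoms, axis=1)

cube = np.zeros((41, 41, 5))                              # toy image with 5 bands
print(dual_window_background(cube, 20, 20).shape)           # 21x21 outer, 15x15 inner
print(dual_window_background(cube, 20, 20, inner=9).shape)  # 21x21 outer, 9x9 inner
```

Because the dictionary is rebuilt for every test pixel, this extraction runs once per pixel in the local-dictionary experiments.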

Next we demonstrate the importance of employing a locally
adaptive background dictionary. The sparsity-based target
detection algorithm is applied to the DR-II and FR-I images using
local and global background dictionaries. The local $A_b$ is
generated by $N_b = 216$ pixels in the outer region of the dual window
centered at the test sample as in Fig. 4, and the global dictionary
($N_b = 1300$ for DR-II and $N_b = 1656$ for FR-I) is generated by
randomly collecting background pixels, which can be reduced
to a smaller size by an unsupervised clustering algorithm such as
K-means. The detection performance is significantly improved
by using a local dictionary, as seen in the ROC curves shown
in Fig. 13. This is because a fixed global dictionary fails to
capture the local similarity between pixels in a small neighborhood.
A local dictionary exploits the local statistics and helps to find
better resemblance of test samples. We see in Fig. 13 that the
detector using local dictionaries outperforms the one using global
dictionaries by a large margin for both HYDICE images.

Fig. 13. ROC curves using the sparsity-based target detector with smoothing
constraint for (a) DR-II and (b) FR-I with local and global background
dictionaries.

Fig. 14. ROC curves for DR-II. (a) Global background dictionary, $N_b = 1300$.
(b) Local background dictionary (dual-window approach), $N_b = 216$.

Under the same settings (i.e., same target and background
training samples for all detectors), we compare the performance
of the proposed sparsity-based algorithm to the previously
developed conventional classifier SVM and detectors MSD, SMF,
and ASD using both global and local background dictionaries. Let
$A_t = [a_1^t \cdots a_{N_t}^t]$ and $A_b = [a_1^b \cdots a_{N_b}^b]$ be,
respectively, the target and background dictionaries used in the
proposed sparsity-based algorithm. Note that in the local case, $A_b$
is adaptive and changes for each test pixel. In order to have a
fair comparison, in the case of SMF the target signature is the
mean of the target dictionary atoms $\{a_i^t\}_{i=1,\ldots,N_t}$ and the
background covariance is obtained from the background dictionary
$A_b$. In the SMF implementation, a regularization term is added
to the background covariance matrix such that the inverse
matrix C in (3) is more stable, as described in [27]. In the case of
MSD, the eigenvectors corresponding to the significant
eigenvalues of the covariance matrices obtained from atoms in $A_t$ and
$A_b$ are used to generate the bases for the target and background
subspaces, respectively [23]. For ASD, the basis for the target
subspace is generated in the same way as in MSD. The ASD noise
covariance matrix is computed from the background training
samples $\{a_i^b\}_{i=1,\ldots,N_b}$ and a regularization term is added to the
noise covariance matrix in order to obtain a stable inverse
matrix. In SVM, a model is trained using atoms in $A_b$ and $A_t$ as two
different classes using a composite kernel which combines the
spectral and spatial features via a weighted summation, where $K_s$
and $K_w$ in (1) are radial basis function kernels [9]. All
parameters are adjusted to obtain the best possible performance. Under
the current setting of target and background dictionaries, the
proposed detector has computational complexity comparable to
that of the classical detectors SMF, MSD, and ASD.
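
The regularization step mentioned for SMF and ASD can be sketched as below. The exact form used in [27] is not reproduced in this section, so the sketch assumes diagonal loading, a common choice in which a small multiple of the identity is added before inversion. The function name, the `loading` parameter, and its scaling by the average band variance are all assumptions.

```python
import numpy as np

def regularized_inverse_covariance(A_b, loading=1e-3):
    """Invert the background covariance after diagonal loading (assumed form).

    A_b: bands x N_b matrix whose columns are background training spectra.
    Returns (inverse, regularized covariance).
    """
    C = np.cov(A_b)                            # bands x bands sample covariance
    scale = np.trace(C) / C.shape[0]           # average variance per band
    C_reg = C + loading * scale * np.eye(C.shape[0])
    return np.linalg.inv(C_reg), C_reg

# With fewer atoms than bands the sample covariance is singular,
# but the loaded matrix can still be inverted.
rng = np.random.default_rng(0)
C_inv, C_reg = regularized_inverse_covariance(rng.normal(size=(10, 5)))
print(np.allclose(C_reg @ C_inv, np.eye(10), atol=1e-6))
```

A larger `loading` gives a better-conditioned inverse at the cost of departing further from the sample covariance, so in practice it would be tuned along with the other detector parameters.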

The  ROC  curves  in  both  the  global  and  local  cases  for  DR-II 
are  shown  in  Fig.  14.  We  see  that  the  sparsity-based  detector 
with  the  smoothing  term  using  a  local  background  dictionary 
outperforms  all  other  detectors.  The  SMF  performs  poorly  since 
the  target  signature  is  represented  by  a  single  vector,  while  in  all 
other  approaches  the  targets  are  assumed  to  approximately  lie 
in  a  subspace.  For  visual  comparison,  the  detector  outputs  for 
SVM,  MSD,  SMF,  and  ASD  are  also  displayed  in  Fig.  8,  where 
the  locally  adaptive  background  dictionary  is  employed.  One 
can  immediately  notice  that  the  sparsity-based  detector  with  the 
smoothing  constraint  also  leads  to  the  best  visual  quality. 

The ROC curves for FR-I are shown in Fig. 15. The FR-I
image is more difficult than the DR-II image due to the presence of
the trees and shadow, whose spectral curves have some
resemblance to those of the targets. From the ROC plots, the proposed
algorithm still leads to the best performance. For visual
inspection, the detection results obtained by SVM, MSD, SMF, and
ASD are illustrated in Fig. 10. For all detectors in Fig. 10, we
can see the bright spots in the shadow area along the tree line.
This is alleviated by the proposed detection algorithm, as seen
in Fig. 9(b).

Fig. 15. ROC curves for FR-I. (a) Global background dictionary, $N_b = 1656$.
(b) Local background dictionary (dual-window approach), $N_b = 216$.

Fig. 16. ROC curves for the mine image using local background dictionary
(dual-window approach), $N_b = 360$.

The AHI image of mines is the most difficult one among the
three test images. The targets include surface mines and buried
mines that are invisible. In this case, the ROC curve is obtained
slightly differently in that only one pixel on the mine needs
to be correctly labeled for the mine to be declared as a target.
Therefore, the PD is calculated by the number of hits divided
by the total number of mines in the test region. In this
experiment, for all detectors, we use $N_t = 50$ target training samples
from two mines outside the test region and $N_b = 360$
background training samples adaptively constructed for each test
pixel by the dual-window approach with inner and outer
windows of size 9 × 9 and 21 × 21, respectively. The ROC curves
for the mine image using local dictionaries are shown in Fig. 16.
The proposed sparsity-based target detection algorithm still
outperforms the other algorithms, especially at low PFA. The
outputs for SVM, MSD, SMF, and ASD are displayed as images in
Fig. 12. We see that although the MSD yields higher PD at
certain PFA, there is a large background area in the middle of the
image where pixels have very high magnitude, hence increasing
the number of false alarms.
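
The mine-level PD described above, where a mine counts as a hit if at least one of its pixels exceeds the threshold, can be sketched as follows. The integer label map (0 for background, a distinct positive integer per mine) and the function name are assumptions introduced for this example; the text does not describe how the ground truth is stored.

```python
import numpy as np

def mine_level_pd(scores, mine_labels, delta):
    """PD = (mines with at least one detected pixel) / (total mines).

    scores: detector output image.
    mine_labels: integer image, 0 = background, k > 0 = pixels of mine k.
    """
    detected = scores > delta
    mine_ids = np.unique(mine_labels[mine_labels > 0])
    hits = sum(bool(np.any(detected[mine_labels == k])) for k in mine_ids)
    return hits / len(mine_ids)

scores = np.zeros((4, 4))
labels = np.zeros((4, 4), dtype=int)
labels[0, 0:2] = 1          # mine 1 occupies two pixels
labels[3, 2:4] = 2          # mine 2 occupies two pixels
scores[0, 1] = 0.9          # only one pixel of mine 1 responds
print(mine_level_pd(scores, labels, delta=0.5))
```

In this toy case one of the two mines is hit by a single pixel, so the object-level PD is one half even though only one of four mine pixels is detected.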

V.  Conclusion 

In this paper, we propose a target detection algorithm for
hyperspectral imagery based on sparse representation of the test
samples. In the proposed algorithm, the sparse representation is
recovered by solving a constrained optimization problem that
simultaneously addresses the sparsity constraint, reconstruction
accuracy, and a smoothness penalty on the reconstructed image.
The detection decision is obtained from the recovered sparse vectors
by reconstruction. The new algorithm consistently outperforms
the previously developed detectors in terms of both qualitative
and quantitative measures, as demonstrated by experimental
results on several real hyperspectral images. Future research
includes the construction of better dictionaries. For example, the
proposed detector can be improved by generating dictionaries
invariant to the effect of atmospheric absorption [21]. We will
also investigate the design and exploitation of more
discriminative dictionaries learned from the training data [37], [40].
This work describes the design and application of an apparatus to image aerosol
particles using digital holography in a flow-through, contact-free manner. Particles in
an aerosol stream are illuminated by a triggered, pulsed laser and the pattern produced
by the interference of this light with that scattered by the particles is recorded by a
digital camera. The recorded pattern constitutes a digital hologram from which an
image of the particles is computationally reconstructed using a fast Fourier transform.
This imaging is validated using a cluster of ragweed pollen particles. Examples
involving mineral-dust aerosols demonstrate the technique's in situ imaging capability
for complex-shaped particles over a size range of roughly 15-500 μm. The
focusing-like character of the reconstruction process is demonstrated using a NaCl
aerosol particle and is compared to a similar particle imaged with a conventional
microscope.

©  2011  Elsevier  Ltd.  All  rights  reserved. 


1.  Introduction 

The in situ characterization of small aerosol particles is
a persistent objective in applied contexts. Examples
include the determination of atmospheric aerosol
composition for climate modeling and the detection of biological
weapons agents for defense applications. Countless
measurements and calculations of single and multiple-particle
scattering patterns can be found in the literature. The
overall goal of such work is to infer information relating
to the particles' physical form, such as size and shape, by
analyzing the angular structure of these patterns; see,
e.g., [1]. Unfortunately, a fundamental limitation of this
approach is the absence of an unambiguous quantitative
relationship between a pattern and the corresponding
particle properties, i.e., the so-called inverse problem.
Consequently, the inference of these properties from the
patterns has proved to be very difficult in practice, except
for the simplest of cases.

Ideally, one would prefer to image the particles
directly, thus eliminating the complexity and ambiguity
associated with interpretation of the scattering patterns.
However, the typical particle size range of interest for
many applications is roughly 0.1-10 μm [1,2]. Because of
this, direct images are possible in part of this range only
with high numerical-aperture (NA) optics and
correspondingly small focal volumes. This typically requires
collection and immobilization of particle samples, and
thus, such imaging is not a practical technique for particle
characterization in applications requiring high sample
through-put or images of the particles in their
undisturbed form, i.e., in situ images.

Holography is an alternative technique that combines
useful elements of both conventional imaging and
scattering. Fundamentally, this is a two-step process: First, an
object is illuminated with coherent light and the intensity
pattern resulting from the interference of this light with
that scattered by the object is recorded. This pattern
constitutes the hologram, from which an image of the
object is then reconstructed. Traditionally, holograms are
recorded with photographic film due to the film's high
resolution, which is required to capture the finer features
of the interference pattern. The subsequent chemical
development of the film is costly and time consuming,
and this greatly limits the practical utility of the
technique. For this reason, charge-coupled device (CCD)
detectors are used to record the interference pattern digitally.
The resulting so-called digital hologram can then be
processed computationally, rather than chemically, to
reconstruct an image of the object.

Digital holographic imaging has been demonstrated
in multiple small-particle systems and across visible and
X-ray wavelengths, see e.g. [3-11]. Examples of work
applying holography to aerosols are scarce, and to the
best of our knowledge have not yet been reported for
in situ imaging of aerosol particles in the 0.1-25 μm size
range using visible light. This article will describe the
design and implementation of an apparatus that achieves
imaging of particles approximately 15-500 μm in size,
and has the potential to image particles as small as 4 μm
given further design optimization. The basic concepts
involved are briefly reviewed and a validation
measurement using ragweed pollen particles is presented.
Saharan, Tunisian, and sodium chloride (NaCl) aerosols
are used to establish the in situ capability of the
apparatus. Finally, the microscope-like focusing behavior of the
image-reconstruction process is demonstrated using a
single NaCl aerosol particle.


2.  Digital  in-line  holography 


The apparatus in this work is based on the so-called
in-line holographic configuration [3]. Here, the particle,
primary optical components, and detector are all
co-linearly arranged. The particle is illuminated by a
monochromatic spherical wave and the resulting interference
pattern formed by this reference wave and the light
scattered by the particle is recorded by a CCD detector.
Let the source of the reference wave be located at a
distance l from the particle and the detector at a distance
d. Provided that kl and kd are large enough to satisfy the
far-field conditions of [12], both the reference and
scattered waves will be transverse and spherical at the
detector and can be represented entirely by their
scattering amplitudes

$$\mathbf{E}^{\mathrm{ref}}(\mathbf{r}) = \frac{\exp(ikr)}{r}\,\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}}), \qquad \mathbf{E}^{\mathrm{sca}}(\mathbf{r}) = \frac{\exp(ikr)}{r}\,\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}}), \qquad (1)$$

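respectively. Then, the intensity of the total wave across
the detector's face is [3]

$$I^{\mathrm{holo}}(\mathbf{r}) = \frac{c\varepsilon_0}{2r^2}\left|\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}}) + \mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})\right|^2, \qquad (2)$$

where $c$ and $\varepsilon_0$ are the vacuum speed of light and electric
permittivity, respectively. Expanding Eq. (2) gives

$$I^{\mathrm{holo}}(\mathbf{r}) = \frac{c\varepsilon_0}{2r^2}\left\{\left|\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right|^2 + \left|\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})\right|^2 + \left[\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right]^* \cdot \mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}}) + \left[\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})\right]^* \cdot \mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right\}. \qquad (3)$$

The quantity $(c\varepsilon_0/2)\,r^{-2}\,|\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})|^2 = I^{\mathrm{ref}}(\mathbf{r})$ in Eq. (3) is the
intensity across the detector when no particle is present,
and hence can be considered a known quantity measured
before the introduction of an aerosol sample. Subtracting
this reference intensity from Eq. (3) and dividing the
remaining terms by it gives

$$I^{\mathrm{con}}(\mathbf{r}) = \frac{I^{\mathrm{holo}}(\mathbf{r}) - I^{\mathrm{ref}}(\mathbf{r})}{I^{\mathrm{ref}}(\mathbf{r})} = \frac{\left|\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})\right|^2}{\left|\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right|^2} + \frac{\left[\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right]^* \cdot \mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}}) + \left[\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})\right]^* \cdot \mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})}{\left|\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right|^2}. \qquad (4)$$
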
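Often, the intensity of the reference wave at the detector
is much greater than that of the scattered wave. This is
especially true in this work where the objects being
illuminated are small particles, as opposed to the
macroscopic sized objects involved in other applications, see
e.g. [13-16]. This means that the term
$|\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})|^2 / |\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})|^2$
in Eq. (4) can be neglected, leaving

$$I^{\mathrm{con}}(\mathbf{r}) \approx \frac{\left[\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right]^* \cdot \mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}}) + \left[\mathbf{E}_1^{\mathrm{sca}}(\hat{\mathbf{r}})\right]^* \cdot \mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})}{\left|\mathbf{E}_1^{\mathrm{ref}}(\hat{\mathbf{r}})\right|^2}. \qquad (5)$$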

This intensity pattern, which is the difference between
two measurements, with and without the particle
present, is known as a contrast hologram. The key
characteristic of $I^{\mathrm{con}}$ is its linear dependence on the
amplitude of the particle's scattered wave. This means
that the phase of the scattered wave over the detector is
encoded in the measurement. Consequently, $I^{\mathrm{con}}$ can be
used to reconstruct unambiguously an image of the
particle that closely resembles that obtained from
conventional microscopy.
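
A minimal numerical sketch of the contrast hologram is given below. It assumes two intensity frames of equal shape, one recorded with the particle and one without, and it adds a small floor on the reference intensity to avoid division by zero; that floor and the function name are assumptions, not part of the method described here. The synthetic check uses scalar fields with a weak scattered wave to show that the result is dominated by the term linear in the scattered amplitude.

```python
import numpy as np

def contrast_hologram(I_holo, I_ref, floor=1e-12):
    """Return (I_holo - I_ref) / I_ref, pixel by pixel."""
    I_holo = np.asarray(I_holo, dtype=float)
    I_ref = np.asarray(I_ref, dtype=float)
    return (I_holo - I_ref) / np.maximum(I_ref, floor)

# Synthetic scalar example: unit reference wave, weak scattered wave.
E_ref = 1.0 + 0.0j
E_sca = 0.01 * np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 8))
I_ref = np.full(E_sca.shape, abs(E_ref) ** 2)
I_holo = np.abs(E_ref + E_sca) ** 2

I_con = contrast_hologram(I_holo, I_ref)
linear_term = 2.0 * np.real(np.conj(E_ref) * E_sca) / abs(E_ref) ** 2
print(np.max(np.abs(I_con - linear_term)))  # size of the neglected quadratic term
```

The printed difference is the quadratic term that is dropped when the scattered wave is much weaker than the reference wave.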

Because there are many references describing the
theory behind digital holographic imaging, only a brief
description will be given here, see e.g. [17-19]. Basically,
the contrast hologram is envisioned as a transmission
diffraction-grating illuminated by a normally incident
plane wave, i.e., a reconstruction wave. The
Fresnel-Kirchhoff approximation is then used to describe the light
diffracted from this grating in a parallel plane separated
by a distance z from the grating along the z-axis. If z
corresponds to the distance between the particle and
detector during the hologram measurement (z = d), the
resulting diffraction pattern in this so-called
reconstruction plane yields an image of the particle. The image is
essentially equivalent to a conventional microscope
image, although the resolution is typically less [3].

The advantage of using the Fresnel-Kirchhoff
approximation to calculate the reconstructed particle image is
that the approximation's mathematical form is essentially
a discrete Fourier transform of the CCD pixel values
constituting $I^{\mathrm{con}}$. This enables the use of the fast Fourier
transform (FFT) in the calculation, thus substantially
reducing the computation time required to render the
particle image. This is fortuitous, because in practice d is
not known to great enough accuracy to be able to
reconstruct an image from a single application of the
reconstruction routine. This inaccuracy is due to the
variation in particle positions in the aerosol stream as
they enter the measurement volume. Consequently, the
image-reconstruction stage consists of a focusing-like
procedure: First an initial image is reconstructed using
an estimate of d based on the experimental layout. Then,
the reconstruction plane is scanned along the z-axis in
small steps until the reconstructed image comes into
focus. The ability to use the FFT for each of these
intermediate steps is thus crucial to the practical
implementation of this technique.
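
The focusing-like scan can be sketched as follows. The reconstruction formula is not given here, so this sketch assumes the common single-FFT discrete Fresnel transform with a plane reconstruction wave: the hologram is multiplied by a quadratic phase factor for the trial distance z and then Fourier transformed. The function names, the use of peak intensity as a stand-in for judging focus, and the omission of constant prefactors and of any magnification from spherical-wave illumination are all assumptions.

```python
import numpy as np

def fresnel_reconstruct(hologram, wavelength, z, pitch):
    """Reconstructed intensity at distance z (single-FFT Fresnel transform)."""
    ny, nx = hologram.shape
    y = (np.arange(ny) - ny // 2) * pitch      # detector coordinates in meters
    x = (np.arange(nx) - nx // 2) * pitch
    X, Y = np.meshgrid(x, y)
    chirp = np.exp(1j * np.pi * (X**2 + Y**2) / (wavelength * z))
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(hologram * chirp)))
    return np.abs(field) ** 2

def focus_scan(hologram, wavelength, z_values, pitch):
    """Reconstruct at each trial z; return the z with the largest peak intensity."""
    peaks = [fresnel_reconstruct(hologram, wavelength, z, pitch).max()
             for z in z_values]
    return z_values[int(np.argmax(peaks))], peaks

# Synthetic test hologram: zone-plate pattern of a point scatterer at d = 8 cm.
n, pitch, wavelength, d = 256, 5.4e-6, 532e-9, 0.08
coords = (np.arange(n) - n // 2) * pitch
X, Y = np.meshgrid(coords, coords)
hologram = np.cos(np.pi * (X**2 + Y**2) / (wavelength * d))

best_z, peaks = focus_scan(hologram, wavelength, [0.04, 0.06, 0.08, 0.10, 0.12], pitch)
print(best_z)
```

Each trial distance costs one FFT, which is what makes a fine scan along z affordable. Peak intensity only suits point-like test objects; for extended particles a sharpness measure or visual inspection would take its place.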

The primary drawback to the in-line configuration is
that two images of the particle are produced in the
reconstruction stage [17]. The in-focus particle image is
always accompanied by a blurred twin image that is
in focus in the mirror reconstruction plane, i.e., at z = -d. As
a consequence, the image quality is degraded. However,
as shown in [3], the effect of the twin on the in-focus
image becomes negligible if both d and the size of the CCD
pixel array are sufficiently large such that an imaging
resolution on the order of the wavelength can
theoretically be achieved [3,4].

Another drawback of in-line holography is the
presence of the zero frequency, or so-called DC, term in the
reconstructed image [20]. In the diffraction-grating
model, the reconstruction wave is uniform across the
hologram since it is planar and normally incident. Upon
application of the FFT to $I^{\mathrm{con}}$, this wave then becomes a
strong DC contribution in the transform. The result is an
unwanted bright spot in the reconstructed image located
at the intersection of the optical axis (z-axis) with the
reconstruction plane. Fortunately, however, the DC term
can be nearly eliminated by subtracting from each pixel
value in $I^{\mathrm{con}}$ the average value of all the pixels [17]. Notice
that in doing this subtraction, the result is a new contrast
hologram with both positive and negative values;
whereas, its constituent holograms $I^{\mathrm{holo}}$, $I^{\mathrm{ref}}$, and $I^{\mathrm{con}}$ are
all inherently positive since they correspond to intensity
measurements.
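
The mean subtraction can be sketched in a few lines. The function name and the random positive test array are assumptions for illustration. The check confirms that the zero-frequency bin of the FFT vanishes to rounding error after the subtraction and that the result takes both signs.

```python
import numpy as np

def remove_dc(I_con):
    """Subtract the average pixel value from every pixel of the hologram."""
    I_con = np.asarray(I_con, dtype=float)
    return I_con - I_con.mean()

rng = np.random.default_rng(0)
frame = rng.uniform(0.5, 1.5, size=(32, 32))   # positive, intensity-like values
centered = remove_dc(frame)

print(abs(np.fft.fft2(frame)[0, 0]))      # large zero-frequency term before
print(abs(np.fft.fft2(centered)[0, 0]))   # essentially zero after
```

Only the zero-frequency bin is changed by this step; all other Fourier components of the hologram are untouched.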

The resolution of the resulting particle images is
limited by several factors related to diffraction and the
apparatus hardware: the CCD pixel size, CCD pixel-array
size w, particle-CCD distance d, and the illumination
wavelength λ [4,17]. Given the configuration of the optical
elements in this work, the theoretical minimum
resolvable length scale is approximately 4 μm following [4].
However, the resolution achieved in practice is in the
range of 8-10 μm due to stray-light noise and
imperfections in the optical design. Fundamentally, the resolution
of this holographic configuration will not exceed what is
possible from a conventional optical microscope.
However, as discussed earlier, it does provide the substantial
advantage of near real-time, in situ, and high through-put
imaging, which is not typically possible with conventional
microscopy.

3.  Apparatus  design  and  validation 

The experimental apparatus, which is shown in Fig. 1,
consists of two primary subsystems: aerosol-particle
sensing and hologram recording. An aerosol stream is
delivered via a nozzle made from a plastic pipettor-tip to
the measurement volume where an optical trigger is used
to sense the presence of a particle [21,22]. This trigger
consists of crossed diode-laser beams, labeled (h) and (i)
in Fig. 1. These lasers have different wavelengths of 635
and 670 nm and intersect near the outlet nozzle
delivering the aerosol. When a particle passes into this
intersection it scatters both wavelengths of light
simultaneously. The scattered light is received by two
photomultiplier (PMT) modules (Hamamatsu Corp., model
H6780-02), (j) in the figure, each sensitive to only one of
the two wavelengths. A series of signal-analysis units
determines if the signals produced by the PMT modules
are coincident. If so, this indicates the presence of a
particle at the trigger laser-beam intersection and a fire
signal is sent to a pulsed laser for the hologram recording.

The triggered light source is a 70 ns pulsed Nd:YAG
laser (Spectra Physics Lasers, Inc., model Y70-532Q),
frequency doubled to 532 nm. This light passes through
a Glan-Thompson polarizer, (a) in Fig. 1, to ensure linear
polarization. The light is then focused by lens (b) onto a
50 μm diameter pinhole (c). Next the primary lobe of this
pinhole diffraction pattern illuminates a second pinhole
(d) with a diameter of 25 μm. These pinholes "clean" the
beam, improving its spatial coherence and enhancing the
quality of the hologram. All but the primary lobe of this
second pinhole pattern is blocked by iris (e), after which lens (f)
collimates the beam, which is brought to a focus by
lens (g) at a point approximately 2 mm from the aerosol
nozzle outlet. This 2 mm is the distance l in Section 2. In
this way, the aerosol particles are illuminated by what is
approximately a spherical wave originating from the
beam waist. The beam continues until reaching the CCD
detector (Finger Lakes Instrumentation, LLC, model
ML8300), at which point it has expanded to fill the entire pixel
array (5.4 μm pixel size, 3326 × 2504 pixel-array size).
The separation between the particle stream and the
detector is the d discussed in Section 2 and is
approximately 8 cm. A small amount of the beam is scattered by
the particle (dashed line in Fig. 1), and this light interferes
with the remainder of the beam, i.e., the reference wave,
to form the interference pattern that becomes the digital
hologram $I^{\mathrm{holo}}$.

To test the apparatus and provide a rough calibration
of the image-reconstruction procedure, a comparison
is made between a holographic and an optical microscope
image of the same particle. This is done by placing
15.4 µm diameter NIST-traceable polystyrene latex
microspheres (Duke Scientific Corp.) on a microscope
slide  and  positioning  the  slide  in  the  measurement 
volume  at  the  intersection  of  the  trigger-beams.  A  holo¬ 
gram  is  recorded,  from  which  the  image-reconstruction 
procedure  of  Section  2  is  followed.  The  slide  is  then 
transferred  to  a  microscope,  where  the  same  spheres  are 
located  and  imaged.  Next,  using  a  1951  USAF  glass-slide 
resolution  target  (Edmund  Optics),  a  scale  factor  is  deter¬ 
mined  relating  the  microscope-image  pixel  number  to 
micrometers.  Then,  by  comparing  the  holographic  image 
of  a  microsphere  to  the  microscope  image  of  the  same 
microsphere,  an  additional  scale  factor  is  determined 
relating  the  hologram  pixel  number  to  micrometers.  In 
this  way,  the  holographic  images  of  all  subsequent  parti¬ 
cles  can  be  rendered  in  calibrated  length  (micrometers), 
rather  than  pixel  number.  This  calibration  procedure  is 
approximate,  however,  because  there  is  ambiguity  in 
determining  the  hologram  pixel-number  size  of  a  given 
microsphere:  The  contrast  between  the  reconstructed 
sphere-image and the background is not sharp, which
can yield a scale factor that over- or underestimates the
particle size.

Fig. 1. Diagram of the apparatus. The middle inset shows a schematic of the
signal-analysis electronics used in the optical trigger to sense the presence of
a particle in the measurement volume. See text for further explanation.

An example is presented in Fig. 2 demonstrating the
comparison between the holographic and microscope
images of the same particle. Here a cluster of ragweed
pollen particles is placed on a microscope slide, then
holographic and microscope images of the cluster are
obtained. By comparing these images, one can see that the
holographic apparatus successfully produces an accurate
image of the pollen cluster, with sufficient resolution to
discern individual pollen particles and even a faint
signature of the single-particle surface roughness seen in the
microscope images. This corresponds to a resolution of
roughly 8-10 µm, although a more rigorous
resolution analysis is not performed. Referring to the
measured and contrast holograms shown in this figure,
one can see how subtraction of the incident beam across
the CCD, i.e., $I^{ref}$, removes noise due to imperfections in
the incident beam profile. This has the consequence of
producing  a  “cleaner”  contrast  hologram,  which  subse¬ 
quently  improves  the  particle  image.  Note  that  the  holo¬ 
graphic  and  microscope  images  of  the  cluster  differ 
slightly  in  overall  size  and  detailed  structural  form. 
Although  it  is  clearly  the  same  cluster  in  (c)  and  (d),  the 
differences  are  likely  due  to  shifting  of  the  cluster  on  the 
microscope  slide  during  transfer  from  the  apparatus  to 
the  microscope. 
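
The reference subtraction described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the contrast hologram is the pixel-wise difference $I^{con} = I^{holo} - I^{ref}$ of two equally sized CCD frames. The function name `contrast_hologram` and the synthetic beam profile and fringe pattern are hypothetical and are not taken from the apparatus software.

```python
import numpy as np


def contrast_hologram(i_holo, i_ref):
    """Return the pixel-wise difference I_holo - I_ref as floats."""
    # Cast to float first: raw CCD frames are often unsigned integers,
    # and subtracting those directly would wrap around below zero.
    i_holo = np.asarray(i_holo, dtype=float)
    i_ref = np.asarray(i_ref, dtype=float)
    if i_holo.shape != i_ref.shape:
        raise ValueError("hologram and reference frames must have the same shape")
    return i_holo - i_ref


# Synthetic demonstration (all values are made up for illustration).
y, x = np.mgrid[0:64, 0:64]
beam = 1000.0 * np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / 800.0)  # imperfect beam profile
fringes = 20.0 * np.cos(0.5 * np.hypot(x - 20, y - 40))           # particle interference term
i_ref = beam              # frame recorded with no particle present
i_holo = beam + fringes   # frame recorded with a particle present

i_con = contrast_hologram(i_holo, i_ref)
```

In this idealized case the slowly varying beam profile cancels exactly and only the fringe term remains; with real frames, shot-to-shot fluctuations of the pulsed laser would leave some residual background.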

There are several unique aspects to the design of this
apparatus. By using the short focal-length lens (g) in Fig. 1
to form a beam waist near the particle, the light
illuminating the particle is more intense than it would be if
only the pinhole were used for illumination (as is usually
done). This results in a relative amplification of the
scattered wave at the detector and enhances the
interference structure of the hologram, leading to improved
particle-image quality. Using a pulsed laser permits the
investigation of particle systems in motion. This also
greatly relaxes the strict mechanical-stability demands
typically required for holographic measurements. There
are no optical elements between the aerosol stream and
the CCD. This gives the apparatus a working distance of
several centimeters, which is substantially greater than
the single- to sub-millimeter working distance of a
microscope objective. Moreover, the absence of any
optical elements between the detector and particle eliminates
“noise” resulting from ambient dust that can collect on
the optical surfaces.

Fig. 2. Validation of the holographic imaging apparatus. Plots (a) and (b) show the measured (digital) hologram $I^{holo}$ and the corresponding contrast hologram $I^{con}$,
respectively, for a cluster of ragweed pollen particles on a microscope slide located at the intersection of the trigger beams (recall Fig. 1). Image (c) shows
the reconstructed image resulting from (b), whereas (d) shows a conventional microscope image of the same cluster.

4.  Applications 

To  further  assess  the  imaging  capabilities  of  the 
apparatus,  several  aerosols  consisting  of  complex-shaped 
particles  are  examined.  The  first  samples  are  sieved 
Saharan  and  Tunisian  sand,  which  are  aerosolized  using 
an  Erlenmeyer  flask  as  follows:  A  small  sample  of  the  sand 
is  placed  in  the  flask,  then  sealed  with  a  stopper.  Two 
aluminum  tubes  pass  through  the  stopper;  one  supplies 
air  to  the  flask,  blowing  the  sand  particles  around,  while 
the  other  tube  allows  some  of  the  airborne  particles  to 
exit  the  flask  and  be  transported  to  the  aerosol  nozzle  in 
the  apparatus.  Fig.  3  shows  the  contrast  holograms  along 
with  the  resulting  particle-image  reconstructions  for  single 


Saharan  and  Tunisian  sand  particles.  For  comparison,  Fig.  4 
shows  microscope  images  of  these  sand  samples.  One  can 
see  that  the  holographic  images  provide  the  same  informa¬ 
tion  of  overall  particle  size  and  morphology  as  the  micro¬ 
scope  images.  For  example,  the  Saharan  particles  appear  to 
have  less  surface  roughness  than  the  Tunisian  particles. 
Note  that  unlike  Fig.  2,  the  particles  shown  in  the  holo¬ 
graphic  reconstructions  (Fig.  3)  and  microscope  images 
(Fig.  4)  are  not  the  same  sand  particles  since  the  holographic 
images  are  obtained  from  flowing  particles. 

Another  unique  capability  of  holographic  imaging  is 
that  some  sense  of  the  three-dimensional  form  of  a 
particle  can  be  garnered  from  a  single  measurement. 
The  basic  idea  is  analogous  to  the  “focusing  in”  on  a 
particle  in  conventional  microscopy.  There,  the  micro¬ 
scope  objective  is  moved  vertically  to  vary  the  distance 
between  it  and  the  microscope  slide,  causing  a  blurred 
image  of  a  particle  to  evolve  into  a  sharp  image.  If  the 
particle has sufficient thickness and transparency, different
depths within the particle can be brought into focus to
give  a  feel  for  the  particle’s  three-dimensional  structure. 
This  same  process  can  be  done  in  digital  holography  by 
computationally  varying  the  distance  d  used  in  the  image- 
reconstruction  stage,  as  is  shown  in  [3].  The  resulting 
sequence  of  images  gives  the  same  impression  of  focusing 
in on the particle as one gets from microscopy. However,
unlike microscopy, where an image must be recorded
at each “focus depth,” the holographic route can obtain
a similar image sequence from the single contrast
hologram.

Fig. 3. Saharan and Tunisian sand particles. Images (a) and (b) show the contrast hologram $I^{con}$ and corresponding reconstructed image for a single
Saharan sand particle. Images (c) and (d) show the same for a single Tunisian sand particle.

Fig. 4. Microscope images of (a) Saharan and (b) Tunisian sand. The particles seen here are taken from the same sand samples used in Fig. 3, but unlike
the ragweed in Fig. 2, these particles are not the exact same particles imaged holographically.
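
The computational focusing described above can be illustrated with a short numerical sketch. The code below is not the reconstruction procedure of Section 2. It substitutes the standard angular-spectrum propagation method and assumes a plane reference wave, whereas the apparatus illuminates the particle with a diverging beam. The 532 nm wavelength and 5.4 µm pixel size come from the text; the function names, the random stand-in hologram, and the trial distances are hypothetical.

```python
import numpy as np


def angular_spectrum_propagate(field, wavelength, pixel, z):
    """Propagate a sampled complex field a distance z (same units as wavelength)."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel)  # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pixel)  # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    # Keep propagating components only; evanescent ones are dropped.
    transfer = np.where(arg > 0.0, np.exp(1j * kz * z), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)


def focus_stack(i_con, wavelength, pixel, distances):
    """Back-propagate one contrast hologram to several trial distances."""
    return [
        np.abs(angular_spectrum_propagate(i_con, wavelength, pixel, -z)) ** 2
        for z in distances
    ]


wavelength = 532e-9  # m, from the text
pixel = 5.4e-6       # m, CCD pixel size from the text
rng = np.random.default_rng(0)
i_con = rng.standard_normal((128, 128))   # stand-in for a measured contrast hologram
trial_distances = [0.078, 0.080, 0.082]   # m, illustrative values around d
stack = focus_stack(i_con, wavelength, pixel, trial_distances)

# Sanity check: propagating forward and then back recovers the input.
forward = angular_spectrum_propagate(i_con.astype(complex), wavelength, pixel, 0.08)
round_trip = angular_spectrum_propagate(forward, wavelength, pixel, -0.08)
```

Every image in `stack` is computed from the same recorded array, which is the point made in the text: the focus sweep needs no additional exposures.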

Fig.  5  shows  an  example  of  this  holographic  focusing 
process.  The  top  row  displays  conventional  microscope 
images  of  a  NaCl  crystal  at  different  focus  depths.  The 


bottom  row  shows  a  holographic  image-sequence  for  an 
aerosolized  NaCl  particle  that  is  produced  by  scanning 
the  reconstruction  plane  along  the  z-axis  around  z=d.  The 
particle  in  the  holographic  images  is  delivered  to  the 
apparatus  in  aerosol  form  by  drying  a  salt  solution  on  a 
hotplate  and  aerosolizing  the  resulting  powder  using  the 
Erlenmeyer generator described earlier in this section. One can
clearly see the strong similarity in the focusing behavior
of the two imaging techniques.

Fig. 5. Focusing behavior of the holographic image-reconstruction process. The top row shows microscope images of a NaCl crystal on a microscope slide
at three different focus depths (a)-(c). The bottom row shows the reconstructed images of a NaCl aerosol particle when the reconstruction plane is at
three positions for z: z < d for (a), z = d for (b), i.e., in focus, and z > d for (c).

5.  Comments 

The  in  situ  images  of  aerosol  particles  presented  here 
are  not  the  only  documented  examples.  Sorensen  et  al. 
have obtained images of the particles constituting
hydrocarbon-flame soot at various stages in soot formation, i.e.,
as a function of height in a flame [23,24]. Here a
10×-power photomicroscope is mated to a conventional film
camera and a 1.5 µs Xe flash lamp is used for particle
illumination. With this arrangement, particles in the
range of roughly 5-100 µm are imaged, which covers
the same particle size range considered in our work. One
might  then  wonder  what  advantage  the  holographic 
approach  offers  over  this  photomicroscope  direct-imaging. 

First,  the  photomicroscope  images  are  obtained  photo¬ 
graphically,  i.e.,  using  film,  requiring  chemical  processing. 
The  holograms,  however,  are  entirely  digitally  recorded 
and  the  resulting  images  are  computationally  rendered. 
Second, and perhaps most important, the photomicroscope
images have a very narrow depth of field, and only
particles constrained within a narrow volume are in
focus, whereas for holographic techniques the focusing
is done computationally, after the hologram is recorded.
This  enables  the  focusing  process  described  in  Section  4, 
which  can  be  used  to  image  multiple  particles  present  at 


different  locations  in  the  measurement  volume  as  demon¬ 
strated  in  [3].  Moreover,  this  can  be  done  from  a  single 
hologram  recording.  To  do  this  with  the  photomicroscope 
would  require  obtaining  a  series  of  exposures  with  the 
microscope  objective  positioned  at  different  distances 
from  the  measurement  volume.  Thus,  if  the  particles  are 
in  motion,  as  they  are  in  flow-through  applications,  a 
series  of  exposures  would  prevent  the  imaging  of  multiple 
particles  present  at  a  given  instant  in  the  measurement 
volume. 

As  mentioned  in  Section  3,  an  inherent  advantage  of 
the  holographic  design  is  that  there  are  no  optical  ele¬ 
ments  between  the  particle  and  detector.  Thus,  there  are 
no  surfaces  for  ambient  dust  to  collect  on  and  become 
sources  of  stray  light,  nor  are  there  any  lens-based 
aberrations  and  multiple  reflections.  Both  of  these  con¬ 
cerns  are  present  in  the  photomicroscope  approach.  The 
absence  of  these  optical  elements  in  the  holographic 
design  is  especially  advantageous  when  one  wishes  to 
investigate  particles  that  are  roughly  the  same  size  as 
ambient  dust. 

6.  Conclusion 

This  work  demonstrates  the  feasibility  of  imaging 
single  and  multiple  aerosol  particles  in  situ  using  digital 
in-line  holography.  Imaging  is  demonstrated  on  ragweed 
pollen, Saharan and Tunisian sand, and NaCl particles, a
range of overall particle sizes covering approximately
15-500 µm. These images are computationally reconstructed
from the digitally recorded holograms and compare
well to the corresponding microscope images. Although the
resolution of the holographic images is lower than that of
the microscope images, one is able to clearly discern single-particle
size and shape. Moreover, the ability to computationally
render the images allows the application of numerical
operations to improve image quality, whereas the analogs
of such operations in conventional optical imaging would be
difficult to implement.

Abstract: We describe a computational imaging technique to
extend the depth-of-field of a 94-GHz imaging system. The technique
uses a cubic phase element in the pupil plane of the system
to render system operation relatively insensitive to object distance.
The cubic phase element also introduces aberrations but,
since these are fixed and known, we remove them using post-detection
signal processing. We present experimental results that
validate system performance and indicate a greater than four-fold
increase in depth-of-field from 17” to greater than 68”.

Index Terms: Computational imaging, extended depth of field,
millimeter  wave  imaging. 


I.  Introduction 

The ability of gigahertz and terahertz frequencies to penetrate
materials that are impenetrable at optical frequencies
has  prompted  recent  interest  in  the  development  of  millimeter 
wave  sources  and  detectors  [1].  Applications  of  this  capability 
include,  for  example,  the  detection  of  concealed  weapons  under 
clothing  [2],  [3].  However,  unlike  the  stationary  figures  shown 
in  Fig.  1,  a  more  typical  scenario  for  this  application  is  screening 
individuals  at  points  of  ingress,  such  as  the  entrance  to  a  building 
or  the  secured  portion  of  an  airport.  To  enhance  the  performance 
of  screening  systems,  one  would  prefer  to  observe  individuals 
as  long  as  possible  as  they  pass  through  a  volume.  This  im¬ 
proves  the  chances  of  detecting  a  hidden  object.  (It  might  also 
reduce  bottlenecks  created  at  portals.)  However,  wavelength  and 
system  considerations  limit  focused  imaging  to  only  a  narrow 
volume  in  depth,  or  depth-of-field.  Thus,  a  screener  has  only 
a  short  amount  of  time  to  detect  the  presence  or  absence  of  a 
concealed  weapon.  Extending  the  depth-of-field  provides  the 
screener  with  more  time  to  observe  an  individual. 

A  similar  problem  occurs  in  iris  recognition  for  security 
applications,  for  example,  logging-on  to  a  computer  system.  In 
this  situation  the  narrow  depth  of  field  produces  unnatural  head 
movements  as  a  user  seeks  to  place  his  or  her  iris  in  the  object 
plane  of  the  optical  system.  Extending  the  depth-of-field  for 
these  systems  has  been  addressed  using  computational  imaging 
techniques [4]-[7]. By computational imaging, we mean an
imaging  system  whose  pre-detection  optics  and  post-detection 
signal  processing  are  designed  jointly  to  achieve  a  result  that 
is  not  possible  using  only  optics  or  only  signal  processing  [8], 
[9]. For example, placing an optical element with cubic phase
in the pupil plane of the optical system renders system
operation relatively insensitive to object distance. However, the
cubic phase also generates an aberrated image. But, since the
aberrations are known, one can correct them using simple
post-detection signal processing. Since the system response
is effectively invariant to object location, the combination of
optical and electronic processing yields a system with larger
depth-of-field than a conventional system.

Fig. 1. Gigahertz imaging through clothing. (a) Visible image of scene. (b)
94-GHz image of weapons concealed under clothing.

Fig. 2. 94-GHz scanning imaging system. (a) Image of system. (b) Schematic
representation with measured dimensions.

In  this  work  we  describe  our  application  of  this  technique 
for  extended  depth-of-field  imaging  to  a  94-GHz  system 
and  present  experimental  results  to  verify  its  performance. 
In  Section  II  we  describe  our  imaging  system  and  present  a 
mathematical  description  of  its  operation  with  and  without 
extended  depth-of-field.  We  describe  design  and  fabrication  of 
the  cubic-phase  element  in  Section  III  and  present  experimental 
results  in  Section  IV.  Section  V  discusses  the  signal  processing 
required  by  the  cubic  phase  system  to  realize  the  extended 
depth  and  presents  the  results  from  this  processing.  We  end  in 
Section  VI  with  summary  remarks  on  our  approach. 

II.  94-GHz  Imaging  System 

Our imaging system, represented in Fig. 2, is a 94-GHz
Stokes-vector radiometer used for millimeter wave
phenomenology measurements [10]. It is a single-beam system
that forms an image by scanning in azimuth and elevation.
The radiometer has a thermal sensitivity of 0.3 K with a 30-ms
integration time and 1-GHz bandwidth per pixel. A Cassegrain
antenna is mounted to the front of the radiometer receiver. It has
a 24”-diameter primary parabolic reflector and a 1.75”-diameter
secondary hyperbolic reflector. The position of the hyperbolic
secondary is variable.

If we model the 94-GHz imager as a linear, spatially
incoherent, quasi-monochromatic system, the intensity of the
detected image can be represented as a convolution of the
intensity of the image predicted by geometrical optics with the
system point spread function [11]

$$|i(x,y)|^2 = o_g(x,y) ** h(x,y) \tag{1}$$

where $**$ represents a two-dimensional convolution. The
function $o_g(x,y)$ represents the inverted, magnified image of the
object that a ray-optics analysis of the system predicts

$$o_g(x,y) = \frac{1}{M^2}\, o\!\left(\frac{-x}{M}, \frac{-y}{M}\right). \tag{2}$$


If the object and image distances are $d_o$ and $d_i$, respectively,
the magnification $M$ is

$$M = \frac{d_i}{d_o}. \tag{3}$$

For the purposes of geometrical analysis, we can model the
system as a single-lens imaging system with $d_i$ = 6” (152.4 mm)

$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}. \tag{4}$$

The value of $d_i$ is based on measurements of the antenna. We
adjusted the position of the hyperbolic element so that nominal
operation of the imager is with $d_o$ = 180” (4572 mm). Thus, the
effective focal length of the system is $f$ = 5.81” (147.6 mm).
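
As a quick check of the numbers above, the thin-lens relation (4) and the magnification (3) can be evaluated directly. The sketch below is illustrative only: the helper names are hypothetical, and applying the single-lens model to the two out-of-focus object distances used later in the experiments is an assumption made here, not a step stated in the text.

```python
def focal_length(d_o, d_i):
    """Thin-lens focal length from object and image distances, Eq. (4)."""
    return 1.0 / (1.0 / d_o + 1.0 / d_i)


def image_distance(d_o, f):
    """Image distance for an object at d_o, from the same thin-lens relation."""
    return 1.0 / (1.0 / f - 1.0 / d_o)


def magnification(d_o, f):
    """Magnification M = d_i / d_o, Eq. (3)."""
    return image_distance(d_o, f) / d_o


f_in = focal_length(180.0, 6.0)  # inches, nominal object and image distances
f_mm = f_in * 25.4

# Magnification at the three object distances used in the experiments,
# computed with the rounded focal length f = 5.81 in quoted in the text.
mags = {d_o: magnification(d_o, 5.81) for d_o in (180.0, 146.5, 113.0)}
```

With these inputs `f_in` rounds to the 5.81” quoted above, and the three magnifications round to 0.0334, 0.0413, and 0.0542, which agree with the values listed in Table I.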

The second term in (1), $h(x,y)$, is the incoherent point spread
function (PSF). It accounts for wave propagation through the
aperture

$$h(x,y) = \left| p\!\left(\frac{x}{\lambda f}, \frac{y}{\lambda f}\right) \right|^2 \tag{5}$$

where $p(x/\lambda f, y/\lambda f)$ is the coherent point spread function. The
function $p(x,y)$ is the inverse Fourier transform of the system
pupil function $P(u,v)$

$$p(x,y) = \mathrm{FT}^{-1}[P(u,v)]. \tag{6}$$


As a consequence, the optical transfer function (OTF) $H(u,v)$
associated with the PSF is the autocorrelation of the pupil
function $P(u,v)$ with frequency axes scaled by $\lambda f$

$$H(u,v) = P(\lambda f u, \lambda f v) \star\star P(\lambda f u, \lambda f v) \tag{7}$$

where $\star\star$ represents two-dimensional correlation. For example,
for a circular aperture of diameter $D$

[equation not recoverable from the source]

Ray analysis of our system confirmed that the parabolic primary
forms the aperture stop, i.e., it defines the location of the pupil
plane.

Displacement $d_e$ of an object from the nominal object plane
introduces a phase error $\theta_e(u,v)$ in the pupil function [11]

$$P_e(u,v) = \exp[-j\theta_e(u,v)]\,\mathrm{circ}(\cdots)$$

where

$$\theta_e(u,v) = \frac{\pi}{\lambda}\,(u^2 + v^2)\,(\cdots)$$

[the remaining factors of these two equations are not recoverable from the source]

The phase error increases the width of a point response. If the
displacement and phase errors are small, the detector (either
human or machine) may be unable to resolve the increase and
the image is perceived as in focus.

The distance in object space over which an object can be
placed and still produce an in-focus image is the system's
depth-of-field DoF

$$\mathrm{DoF} = d_o^{+} - d_o^{-} \tag{12}$$

where formulas for $d_o^{+}$ and $d_o^{-}$ depend upon system
application. Many different definitions exist. For demonstration
purposes, we use a conventional definition based on the spatial
extent $\delta$ of the point response [12].

[Eqs. (13) and (14), which give $d_o^{+}$ and $d_o^{-}$, are not recoverable from the source]

Fig. 3. Maximum relative pupil phase error as a function of object distance.
The shaded region indicates a conventional depth of field. The discrete points
indicate object distances used in experiments.


Under the assumptions that $\delta$ is defined by the Rayleigh criterion,
the imager operates in the far field ($d_o \gg f$), and the lens
aperture is large compared to a wavelength

$$\mathrm{DoF} \approx (2.44\lambda)\left(\frac{d_o}{D}\right)^2. \tag{15}$$

For a 94-GHz imager with $D$ = 24” and $d_o$ = 180”, DoF ≈
17.4”, which ranges from 175.2” to 192.6”. Fig. 3 indicates the
maximum relative phase error as a function of object distance
and also indicates the region we have defined as being in the
depth-of-field. (The maximum error for a given plane occurs at
the edge of the aperture, $u^2 + v^2 = D^2/4$.)
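
The depth-of-field estimate can be reproduced numerically. The sketch below assumes that Eq. (15) has the form DoF ≈ 2.44 λ (d_o/D)², which is how the garbled equation is read here, and takes the wavelength from the 94-GHz operating frequency. The function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0   # speed of light
INCH_PER_M = 1.0 / 0.0254   # unit conversion


def wavelength_in(freq_hz):
    """Free-space wavelength in inches at the given frequency."""
    return C_M_PER_S / freq_hz * INCH_PER_M


def depth_of_field(wavelength, d_o, aperture):
    """Approximate DoF under the assumed form of Eq. (15); all lengths in the same unit."""
    return 2.44 * wavelength * (d_o / aperture) ** 2


lam = wavelength_in(94e9)                # about 0.126 in
dof = depth_of_field(lam, 180.0, 24.0)   # d_o = 180 in, D = 24 in
```

Under this reading the estimate comes out near 17.2”, close to the 17.4” quoted in the text. The small difference may reflect rounding or the exact expressions for the near and far limits, which are not available here.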

Equation  (15)  explains  mathematically  what  any  good  pho¬ 
tographer  already  knows:  one  can  increase  DoF  by  decreasing 
(“stopping  down”)  the  lens  aperture  D.  However,  this  reduces 
throughput  and  degrades  the  diffraction  limited  resolution. 
Alternatively,  it  has  been  shown  at  optical  wavelengths  that  a 
cubic-phase  element  placed  in  the  pupil  plane  of  an  imaging 
system  in  combination  with  post-detection  processing  can  also 
increase  DoF  but  without  sacrificing  either  throughput  or 
resolution. 

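The cubic phase element $P_c(u,v)$ is

$$P_c(u,v) = \exp[j\theta_c(u,v)]\,\mathrm{rect}\!\left(\frac{u}{W_u}, \frac{v}{W_v}\right) \tag{16}$$

where

$$\theta_c(u,v) = (\pi\alpha)\left[\left(\frac{2u}{W_u}\right)^3 + \left(\frac{2v}{W_v}\right)^3\right]. \tag{17}$$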

The phase function is separable in the $u$- and $v$-spatial
frequencies and has spatial extent $W_u$ and $W_v$ along the respective axes.
The constant $\alpha$ represents the strength of the cubic phase. Along
one axis the total phase change is $2\pi\alpha$; the phase change across
a diagonal is $4\pi\alpha$. In the simulations presented below, we
modified the model for $P_c(u,v)$ slightly and included an
appropriately sized central obscuration to account for the effect of rays
blocked by the secondary mirror.

Fig. 4. Modulation transfer function for systems with different strength cubic
phase: (a) $\alpha = 0$, (b) $\alpha = 7$, and (c) $\alpha = 20$. The MTFs for four different
object distances are represented: 180” (solid line), 163” (dashed line), 146.5”
(dot-dashed line), and 113” (dotted line).

Fig. 5. Simulated point spread functions for conventional imaging and imaging
with a cubic phase. Simulated PSFs for conventional imaging system at (a) 180”,
(b) 146.5”, and (c) 113”. (d)-(f) Simulated PSFs for an imaging system with
cubic phase at the same object distances as (a)-(c).
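
The stated phase excursions of the cubic element can be checked with a short sketch. It assumes the phase profile θ_c(u,v) = πα[(2u/W_u)³ + (2v/W_v)³], a form chosen here because it reproduces the 2πα and 4πα figures given in the text. The function name, the grid size, and the omission of the central obscuration are illustrative choices.

```python
import numpy as np


def cubic_phase(u, v, alpha, w_u, w_v):
    """Assumed cubic phase profile over an aperture of extent w_u by w_v."""
    return np.pi * alpha * ((2.0 * u / w_u) ** 3 + (2.0 * v / w_v) ** 3)


alpha = 7.0   # cubic phase strength used for the fabricated element
w = 24.0      # aperture extent in inches along both axes

# Phase map sampled on a regular grid across the full aperture.
u = np.linspace(-w / 2, w / 2, 241)
U, V = np.meshgrid(u, u)
theta = cubic_phase(U, V, alpha, w, w)

# Total phase change along one axis and across a diagonal.
axis_change = cubic_phase(w / 2, 0.0, alpha, w, w) - cubic_phase(-w / 2, 0.0, alpha, w, w)
diag_change = cubic_phase(w / 2, w / 2, alpha, w, w) - cubic_phase(-w / 2, -w / 2, alpha, w, w)
```

For α = 7 the two excursions evaluate to 14π and 28π radians, i.e., 2πα and 4πα.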


Cross-sections of the modulation transfer function (MTF),
i.e., the magnitude of the OTF $|H(u,v)|$, are shown
in Fig. 4 for three values of $\alpha$ = {0, 7, 20} and
different values of misfocus. (The misfocused planes are located
roughly at 1, 2, and 4 times the DoF. The values noted in the
figure are measured distances used in our experiments.) A
conventional system with no cubic phase ($\alpha = 0$) is represented in
Fig. 4(a). Note that the MTFs differ for each value of misfocus.
Compensating for misfocus therefore requires a priori
knowledge of where an object is located. Even if this information were
known, due to the presence of zeros in the MTFs, inverting any
one of them is ill-posed and will generate noisy results.

In contrast, the MTFs for cubic phase elements with non-zero
values of $\alpha$ are relatively constant over an extended range. Note
that the larger the value of $\alpha$, the larger the range over which
the system is insensitive to object location. However, increasing
$\alpha$ reduces the magnitude of the MTF, which is detrimental for
applications with low signal-to-noise ratios. But, because the
MTFs do not contain any zeros, their inversion is better
conditioned than the MTFs for a conventional system.
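
The contrast between the two cases can be illustrated with a one-dimensional numerical sketch. The code below is not the simulation behind Fig. 4. It assumes a clear 1-D pupil with no central obscuration, the cubic phase πα x³ in the normalized coordinate x = 2u/W, and a misfocus modeled as a quadratic phase ψ x², where ψ is a hypothetical parameter giving the misfocus phase in radians at the pupil edge. The MTF is the magnitude of the pupil autocorrelation.

```python
import numpy as np


def mtf_1d(alpha, psi, n=512):
    """Normalized 1-D MTF of a pupil with cubic phase strength alpha and misfocus psi."""
    x = np.linspace(-1.0, 1.0, n)  # normalized pupil coordinate 2u/W
    pupil = np.exp(1j * (np.pi * alpha * x**3 + psi * x**2))
    # np.correlate conjugates its second argument, so this is the autocorrelation.
    otf = np.correlate(pupil, pupil, mode="full")
    return np.abs(otf) / n  # equals 1 at zero spatial frequency


def misfocus_sensitivity(alpha, psi):
    """Largest change in the MTF caused by the misfocus psi."""
    return np.max(np.abs(mtf_1d(alpha, psi) - mtf_1d(alpha, 0.0)))


psi = 10.0  # rad at the pupil edge, illustrative value
conventional = misfocus_sensitivity(0.0, psi)
cubic = misfocus_sensitivity(7.0, psi)
```

Comparing `conventional` with `cubic` gives a single-number view of the behavior described above: the misfocus changes the MTF of the clear pupil far more than it changes the MTF of the cubic-phase pupil.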

Simulations of the point spread functions one can expect from
our imaging system with and without a cubic phase element
with $\alpha = 7$ are represented in Fig. 5. The response of the cubic
phase system is relatively unchanged, whereas the response of
the conventional system changes considerably. We address in
Section V the processing required by the cubic phase system to
produce a well-defined spot.

Fig. 6. Representation of the process for converting the cubic phase to a surface
depth profile. Only a one-dimensional representation is shown but extensions to
two dimensions are straightforward.

III.  Cubic  Phase  Design  and  Fabrication 

We fabricated a cubic phase element with $\alpha = 7$ from Rexolite
using a 3-axis computerized numerical control router. The
router has a 0.1969” (5 mm) minimum feature size and provides
0.0002” (5 µm) position accuracy. Element fabrication was a
two-step process.

In the first step we machined a continuous surface profile by
sampling the cubic phase $\theta_c(u,v)$, converting phase to depth,
and using a cubic spline to ensure a smooth transition between
depth samples. The phase samples were generated according to

[Eq. (18) is not recoverable from the source]

We used a sampling distance $\Delta$ = 0.1” (2.54 mm) to ensure
overlapping features. Given $W_u = W_v$ = 24”, $L = M = 240$.
Phase values were converted to depth $d_{\ell m}$ using

$$d_{\ell m} = \frac{\lambda}{2\pi(n-1)}\,[\theta_{\ell m} \bmod 2\pi]. \tag{19}$$


The modulo-$2\pi$ operator limits the phase to only a single
wavelength and reduces element weight. See Fig. 6. The diffractive
characteristics introduced in this conversion have little effect on
the response of the element [5], [6]. At 94 GHz Rexolite has a
refractive index $n$ = 1.59; thus, a depth change of 0.2129” (5.4
mm) in the material generates a $2\pi$ phase change in the
wavefront. The second step sharpened the edges at phase
discontinuities to within <0.1 mm. The final element is shown in Fig. 7
mounted to the antenna.

Fig. 7. Fabricated cubic phase element. (a) Side view and (b) front view of
cubic phase element mounted to imaging system. (c) Detail of fabricated
element. The region displayed is in the lower right of the phase element, which is
highlighted in (b).

Fig. 8. Measured point spread functions for conventional imaging and imaging
with a cubic phase. PSFs for conventional imaging system at (a) 180”, (b) 146.5”,
and (c) 113”. (d)-(f) PSFs for a system with cubic phase at the same object
distances as (a)-(c).
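
The phase-to-depth conversion can be sketched as follows. The code assumes the relation d = λ/(2π(n − 1)) · (θ mod 2π), which is consistent with the 5.4 mm depth per 2π quoted in the text for n = 1.59 at 94 GHz. The function name and the sample phase values are hypothetical.

```python
import numpy as np

C_MM_PER_S = 299_792_458.0e3  # speed of light in mm/s


def phase_to_depth(theta, wavelength, n_index):
    """Convert phase (rad) to material depth after wrapping the phase to [0, 2*pi)."""
    wrapped = np.mod(theta, 2.0 * np.pi)
    return wavelength / (2.0 * np.pi * (n_index - 1.0)) * wrapped


wavelength_mm = C_MM_PER_S / 94e9                # about 3.19 mm
full_wave_depth_mm = wavelength_mm / (1.59 - 1.0)  # depth giving a 2*pi phase change

theta = np.array([0.0, np.pi, 1.5 * np.pi, 5.0 * np.pi])  # sample phase values
depth_mm = phase_to_depth(theta, wavelength_mm, 1.59)
```

Because of the wrapping, a phase of 5π maps to the same depth as a phase of π, and no machined depth reaches the full-wave value of about 5.4 mm.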

IV. Experimental Results

To validate that the cubic phase element extends the
DoF, we measured the PSF of a conventional system
and the cubic-phase system at three distances: 113” (2870
mm), 146.5” (3721 mm), and 180” (4572 mm). Since the DoF
is asymmetric with respect to the object plane and collapses
more quickly as the object plane moves toward the system, we
measured only this behavior. The out-of-focus object planes
correspond to displacements that are twice and four times the
calculated DoF. We also imaged an extended object at the
same distances using both systems.

To measure the PSF we imaged a point source generated by
an open waveguide with dimensions 0.050” × 0.100”. Given
that the operating frequency was 94 GHz, the aperture in
wavelengths is 0.40λ × 0.80λ. The output power was -14 dBm. The
results are represented in Fig. 8. The experimental results agree
qualitatively with the simulations presented in Fig. 5. The
figures are normalized to the peak value measured, which occurs
in Fig. 8(a).

The extended object used in our experiments is represented
in Fig. 9(a). The spoke pattern produces 50-50 square waves
whose period varies linearly with radius, giving low frequencies
at the circumference and high frequencies at its center. Given
that the pattern contains 8 periods within one rotation, the
period S of the imaged square wave as a function of radius r is

$$S = \frac{2\pi r M}{8} \qquad (20)$$

and the corresponding spatial frequency ρ is

$$\rho = \frac{1}{S \lambda f}. \qquad (21)$$

The maximum radius $r_{\max}$ = 5.38” and the minimum $r_{\min}$ =
0.38”. Since the cutoff frequency for a circular aperture $\rho_c$ is

$$\rho_c = \frac{D}{\cdots} \qquad (22)$$

the corresponding radius $r_c$ is determined by equating (21) and
(22).

Radii within the range $r_{\max} > r > r_c$ generate frequencies
in the passband of the optical system. Higher frequencies
contained in the region $r_c > r > r_{\min}$ are cut off. Table I lists
the magnification, minimum and maximum spatial frequencies,
and the cutoff radius as a fraction of the maximum radius for
the three object distances used in the experiments. We note that
beyond 304” the system is incapable of resolving the target at
all.

The extended object was generated by placing a metal plate
cut-out of Fig. 9(a) in front of a metal reflector angled at 45°
to a bath of liquid nitrogen. See Fig. 9(b). This arrangement
produces a contrast between the surrounding metal reflecting
ambient room temperature and the temperature of the liquid
nitrogen reflected through the cut-out. The images captured by the
system are represented in Fig. 10.

Note that the MTF of the conventional system produces
images with significant high frequency loss. In contrast, the entire
band of frequencies between $\rho_{\min}$ and $\rho_c$ can be seen in the
images captured using a cubic phase element. Even without signal
processing these images retain more discernible characteristics
of the spoke target than the image from the conventional system.

Fig. 9. (a) Representation of the extended object used to compare conventional
and cubic-phase imaging. (b) Schematic of object illumination.

TABLE I

Parameters for Extended Object Imaging

Object distance (in.)   Magnification   Min. spatial freq.   Max. spatial freq.   Cutoff radius / max. radius
180.0                   0.0334          9.7127               139.4409             0.5909
146.5                   0.0413          7.8448               112.6238             0.4772
113.0                   0.0542          5.9768               85.8067              0.3636

Fig. 10. Images of an extended object for conventional imaging and imaging
with a cubic phase. Images from a conventional imaging system at (a) 180”, (b)
146.5” and (c) 113”. (d)-(f) Images from a system with cubic phase at the same
object distances as for (a)-(c).


V. Post-Detection Signal Processing and Results

Removing the artifacts of the aberrations introduced by
the cubic phase element requires post-detection electronic
processing. We assumed a linear process

$$i_p(x,y) = |i(x,y)|^2 ** w(x,y) \qquad (24)$$

and implemented w(x,y) as a Wiener filter in Fourier space

$$W(u,v) = \frac{H_c^*(u,v)}{|H_c(u,v)|^2 + \cdots}$$


The optical transfer function (OTF) $H_c(u,v)$ associated with
the cubic phase element was estimated from the experimentally
measured point response images. The parameter K is a
measure of the signal-to-noise ratio. The functions $\Phi_O(u,v)$ and
$\Phi_N(u,v)$ are the expected power spectra of the object and noise,
respectively. Research has shown that Wiener power spectra
are good assumptions for natural scenes [13]. We adjusted the
mean spatial detail parameter to produce restored PSFs with
widths comparable to that of the experimentally measured
focused point. Further, we assumed a flat noise spectrum with
K = 50.
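
A minimal NumPy sketch of this kind of Wiener restoration is given below. It is not the processing chain used in the experiments: for illustration we assume that both the object and noise spectra are flat, so that the regularizing term reduces to the constant 1/K², and we use a synthetic Gaussian blur in place of the measured cubic-phase PSF. The function name `wiener_restore` and all test data are hypothetical.

```python
import numpy as np


def wiener_restore(image, psf, K=50.0):
    """Apply W = conj(H) / (|H|^2 + 1/K^2) in Fourier space.

    `psf` must have the same shape as `image` and be centred on the array.
    The constant 1/K^2 stands in for the noise-to-object power-spectrum
    ratio (an assumption made for this sketch).
    """
    psf = psf / psf.sum()
    H = np.fft.fft2(np.fft.ifftshift(psf))        # OTF estimated from the PSF
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / K ** 2)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * W))


# Synthetic check: blur a single point with a Gaussian PSF, then restore it.
n = 64
yy, xx = np.mgrid[:n, :n]
psf = np.exp(-((xx - n // 2) ** 2 + (yy - n // 2) ** 2) / (2 * 2.0 ** 2))
obj = np.zeros((n, n))
obj[20, 40] = 1.0
H_blur = np.fft.fft2(np.fft.ifftshift(psf / psf.sum()))
blurred = np.real(np.fft.ifft2(np.fft.fft2(obj) * H_blur))
restored = wiener_restore(blurred, psf, K=50.0)
peak = np.unravel_index(np.argmax(restored), restored.shape)
```

Raising K sharpens the restored point at the cost of amplifying noise, which is the trade-off behind tuning the filter until the restored PSF width matches the focused point.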

Reconstructed PSFs are represented in Fig. 11. In
Fig. 11(a)-(c), the cubic OTF was estimated from the PSF
measured experimentally at 146.5” and the subsequent
reconstruction filter was applied to all the images. In Fig. 11(d)-(f),
the cubic OTF was estimated from the experimental PSF at
180”. From Fig. 4, data collected at 113” has significant high
frequency loss in comparison to that collected at both 146.5”
and 180”. We, therefore, did not use this data to create a
reconstruction filter.

Fig. 11. Processed point responses from a system with cubic phase. Images
processed using the PSF for $d_o$ = 146.5” at (a) 180”, (b) 146.5” and (c) 113”.
(d)-(f) Images processed using the PSF for $d_o$ = 180”.

Fig. 12. Processed images from a system with cubic phase. Images processed
using the PSF for $d_o$ = 146.5” at (a) 180”, (b) 146.5” and (c) 113”. (d)-(f)
Images processed using the PSF for $d_o$ = 180”.

In both instances the best reconstruction occurs for the image
matched to the filter and the noisiest reconstruction is at 113”,
as one would expect. Nonetheless, the reconstructions of
the point response are comparable in terms of spatial scale to
the in-focus response of the conventional system. Compare
Fig. 11(a), (b), (d), and (e) to Fig. 8(a). Thus, we have extended
the region over which the system generates a diffraction-limited
spot from 5” in front of the focal plane to 34”. Since we expect
similar behavior for objects beyond the focal plane, the depth
of field has been expanded in that direction by at least 34” as
well, although we expect the actual extension to be considerably larger.

This behavior is reflected also in the reconstruction of the
extended object represented in Fig. 12. For both reconstructions,
not only are the images at 180” comparable to the focused image
from the conventional system but the images at 146.5” are
comparable as well. In comparison to the image at 146.5” from the
conventional system, the reconstructed images display higher
contrast and higher resolution. Note especially the reflection off
the metal plate on the right-hand side, which is clearly visible in
the reconstructions but is apparent only in the focused
conventional image in Fig. 10.


VI.  Summary 

We  applied  a  computational  imaging  technique  for  extending 
depth-of-field  at  optical  frequencies  to  a  millimeter  wave 
imaging  system.  The  technique  requires  inserting  a  cubic 
phase  element  in  the  pupil  plane  of  an  imaging  system  and 
subsequent  post-detection  signal  processing.  We  designed  and 
fabricated  the  cubic  phase  element  in  Rexolite  and  validated  its 
performance  experimentally. 

In  some  applications,  a  priori  range  information  can  be  used 
to  improve  estimated  PSFs  and  thereby  improve  restoration. 
Further,  non-linear  image  restoration  techniques  incorporating 
a  priori  knowledge  of  the  scene  can  improve  restoration  relative 
to  linear  restoration. 

A  critical  difference  between  the  performance  of  millimeter 
wave  imaging  systems  and  imaging  systems  for  optical  and 
infrared wavelengths is the underlying phenomenology and
availability  of  technology,  especially  detector  arrays.  Millimeter 
wave  systems  image  temperature  contrasts.  Careful  analysis 
of  noise  and  contrast  in  such  systems  is  necessary  to  assess 
the  impact  of  inserting  an  element  into  the  optical  train  whose 
amplitude  transfer  function,  although  flat,  is  relatively  low.  A 
more  in-depth  analysis  should  also  consider  the  coherence  and 
spectral  bandwidth  of  the  illumination. 

In addition, in terms of practical application, one needs to
consider the impact on system design of the scale of the optical
system and the lack of large arrays of millimeter wave detectors.
Whereas one can design an optical staring imager, millimeter wave
systems will continue to be scanning systems until detector
array technology matures. Even so, given the physical
constraint on the security system mentioned in the introduction, it
is unlikely that such a system will have a detector array larger
than 200 x 200. Nonetheless, the applicability and advantage of
computational imaging techniques to millimeter wave systems
have been demonstrated.

We demonstrate coherent combining (phase locking) of seven laser beams emerging from an adaptive
fiber-collimator array over a 7 km atmospheric propagation path using a target-in-the-loop (TIL) setting. Adaptive control
of the piston and the tip and tilt wavefront phase at each fiber-collimator subaperture resulted in automatic focusing
of the combined beam onto an unresolved retroreflector target (corner cube) with precompensation of quasi-static
and atmospheric turbulence-induced phase aberrations. Both phase locking (piston) and tip-tilt control were
performed by maximizing the target-return optical power using iterative stochastic parallel gradient descent (SPGD)
techniques. The performance of TIL coherent beam combining and atmospheric mitigation was significantly
increased by using an SPGD control variation that accounts for the round-trip propagation delay (delayed
SPGD). © 2011 Optical Society of America
OCIS codes: 140.3298, 010.1285, 010.1080.


Coherent combining of laser beams that originate from a
fiber-based multichannel master oscillator power
amplifier (MOPA) laser system at a remotely located target
after propagation through the atmosphere requires
adaptive compensation of both random phase shifts
introduced by the MOPA system and atmospheric
turbulence-induced phase aberrations [1,2]. Coherent beam
combining, also referred to as phase locking, has been
demonstrated in several laboratory-based experiments
(see, e.g., [3-7]) and over a 408 m long distance in an
outdoor experiment with a cooperative target [8].

In this Letter, we report the results of the first (to the
best of our knowledge) successful coherent beam combining
and turbulence mitigation experiments over an
extended-length atmospheric propagation path in a
target-in-the-loop (TIL) setting with a noncooperative target using
adaptive control of the piston (subaperture-averaged
phase) and tip and tilt corrections at each fiber-array
subaperture. The round-trip propagation delay issue, a
major obstacle for TIL adaptive optics techniques, was
overcome by utilizing the recently proposed “delayed”
stochastic parallel gradient descent (SPGD) wavefront
control technique [9], which allowed the duration
between wavefront control updates to be shorter than
the round-trip propagation delay and resulted in a
significant increase of compensation bandwidths.

The setup used in the experiments (Fig. 1) consists of
the following major subsystems: (i) a seven-channel
master oscillator power amplifier (MOPA) system based
on single-mode, polarization-maintaining (PM) fiber
elements; (ii) a fiber-collimator array with built-in
capabilities for electronic control of wavefront phase tip and
tilts at each fiber-collimator subaperture; (iii) an
unresolved target (a corner-cube retroreflector) located at
7 km distance; (iv) a receiver telescope for
measurements of the target-return optical wave power, referred
to as the power-in-the-bucket (PIB) metric, J; and (v) a
control unit that includes piston (phase-locking) and
tip-tilt phase control subsystems.

In the MOPA system, the light from a narrow-linewidth
(~5 kHz) fiber laser with wavelength λ = 1064 nm and
single-mode PM fiber output is divided into seven
channels using a fiber splitter with integrated, electrically
controlled phase-shifting elements from EOSPACE [10]. The
MOPA system output fibers, each with a mode field
diameter of 7 μm, are connected to a fiber-collimator array
(Optonicus INFA 7C [11]). In the fiber array, the tip of
each output fiber is placed in the focus of the
corresponding collimating aspheric lens with a clear aperture
diameter of d = 33 mm and a focal distance of f = 174 mm.
The closest center-to-center distance between the
collimating lenses in the array is 37 mm, and the entire
fiber-array aperture is 107 mm. The output fibers are mounted
inside special fiber-positioner devices with
piezo-actuators that can independently displace the fiber tips
within a ±35 μm range in two lateral directions [11,12].
These fiber-tip displacements result in controllable
deviations of the propagation directions of the outgoing beams
anywhere within a ±0.2 mrad angular range about the
optical axis and were used to provide precise overlapping of
the outgoing beams at a remote target (electronic beam
focusing) as well as precompensation of wavefront phase
tip and tilt static and dynamic aberrations [4].
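
The quoted steering range follows from the fiber-tip displacement and the lens focal distance in the small-angle approximation. The short Python sketch below reproduces the arithmetic; the function name is ours, and the lateral offset at the target is only a geometric small-angle estimate that ignores diffraction and turbulence.

```python
def steering_angle_mrad(tip_displacement_um, focal_length_mm):
    """Small-angle beam deviation (mrad) produced by a lateral fiber-tip shift."""
    return (tip_displacement_um * 1e-6) / (focal_length_mm * 1e-3) * 1e3


angle = steering_angle_mrad(35.0, 174.0)   # maximum tip displacement, f = 174 mm
offset_m = angle * 1e-3 * 7000.0           # corresponding beam shift after 7 km
print(f"{angle:.2f} mrad, {offset_m:.1f} m at the target")
```

A deviation of about 0.2 mrad therefore moves a beam by roughly a metre at the 7 km target, which is ample range for overlapping the seven beams on a 50 mm retroreflector.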

The outgoing beams with a combined optical power of
12 mW were transmitted through a window located in the
Intelligent Optics Laboratory on the fifth floor of the
University of Dayton’s College Park Center building
(15 m above ground) and propagated toward the
corner-cube retroreflector (50 mm aperture) located in a shed on
the rooftop of a 40 m high building 7 km away. The
laboratory double-glass window introduced significant
phase aberrations with a peak-to-valley (PV) amplitude
of ~1.0λ over the fiber-array aperture and ~λ/4 PV over
fiber-array subapertures. The impact of these quasi-static
aberrations was partially mitigated using the adaptive
tip-tilt control system.

The optical wave returning from the target entered a
receiver telescope (aperture 20 cm) located near the
fiber-array transmitter, as shown in Fig. 1(b). The received
light was divided between a CCD camera and a
photodiode for telescope pointing (target imaging) and the
received light power measurements, respectively. The
photodiode output signal was used as the performance
metric, J, for both the phase locking and the tip-tilt
control subsystems.

The two parallel operating control subsystems were
both based on the maximization of the PIB metric, J,
using an asynchronous SPGD control technique with
significantly different (~48 times) iteration rates [13]. The
tip-tilt control subsystem with 14 control channels (two
per fiber collimator) utilized a personal computer with
analog input/output cards and a set of high-voltage
amplifiers (±70 V) for generation of the control voltages
$\{u_j^{(n)}\}$ (j = 1, ..., 14), which were applied to the
piezo-actuators. At each tip-tilt iteration (n), a control voltage
update was performed using the conventional SPGD
algorithm [13]

$$u_j^{(n+1)} = u_j^{(n)} + \gamma\,\delta u_j^{(n)}\left[J_+^{(n)} - J_-^{(n)}\right], \qquad (1)$$

where γ is the gain coefficient and $\{\delta u_j^{(n)}\}$ is a set of 14
small-amplitude random control voltage changes,
denoted as perturbations. The perturbations in the form
$\{\delta u_j^{(n)}\}$ (positive) and $\{-\delta u_j^{(n)}\}$ (negative) are applied
between two sequential updates of the control voltages. In
Eq. (1), $J_+^{(n)}$ and $J_-^{(n)}$ are the measured PIB metric values
that correspond to the positive and negative
perturbations. The characteristic time $\tau_{\mathrm{SPGD}}$ (SPGD cycle time)
between sequential control voltage updates is given by

$$\tau_{\mathrm{SPGD}} = 2(\tau_{\mathrm{pert}} + \tau_{\mathrm{resp}} + \tau_J + \tau_{\mathrm{delay}}), \qquad (2)$$

where $\tau_{\mathrm{pert}}$ is the time required to perturb the control
voltages, $\tau_{\mathrm{resp}}$ is the delay between a control voltage change
and the corresponding optical phase response, $\tau_J$ is the
PIB metric measurement time, and $\tau_{\mathrm{delay}}$ is the delay
between an induced wavefront phase variation and the
corresponding metric change. The last term in Eq. (2) is the
double-pass delay $\tau_{\mathrm{delay}} = 2L/c$ caused by the optical
wave propagation over the distance L with the speed of
light, c (in the experimental setting L = 7 km and $\tau_{\mathrm{delay}}$ =
46.7 μs). The tip-tilt SPGD cycle time, $\tau_{\mathrm{SPGD}}$ in Eq. (2), is
mainly limited by the time response of the
piezo-actuators, $\tau_{\mathrm{resp}}$ ~ 120 μs, which is significantly longer than
$\tau_{\mathrm{delay}}$ and $\tau_J$ ~ 20 μs. The resulting tip-tilt subsystem
SPGD iteration rate $f_{\mathrm{SPGD}} = 1/\tau_{\mathrm{SPGD}}$ was $f_{\mathrm{SPGD}}$ ~ 3 kHz.

Fig. 1. (Color online) (a) Schematic of the experimental setup
used for coherent beam combining over a 7 km atmospheric
propagation path. (b) Photo of the fiber-array transmitter with
the pointing telescope (right) and the receiver telescope (left).
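
The update rule of Eq. (1) and the timing budget of Eq. (2) can be sketched in Python as follows. This is a toy illustration, not the controller used in the experiments: the quadratic `metric` stands in for the measured PIB signal, the gain and perturbation amplitude are arbitrary, and the perturbation time is set to zero because it is not stated in the text.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s


def round_trip_delay(path_length_m):
    """Double-pass propagation delay 2L/c."""
    return 2.0 * path_length_m / C


def spgd_cycle_time(t_pert, t_resp, t_J, t_delay):
    """Eq. (2): two perturbations (positive and negative) per control update."""
    return 2.0 * (t_pert + t_resp + t_J + t_delay)


def spgd_step(u, metric, gain, amp, rng):
    """Eq. (1): one conventional SPGD update of the control vector u."""
    du = amp * rng.choice([-1.0, 1.0], size=u.shape)   # random +/- perturbations
    J_plus = metric(u + du)
    J_minus = metric(u - du)
    return u + gain * du * (J_plus - J_minus)


# Timing for the tip-tilt loop (t_pert = 0 is an assumption of this sketch).
t_delay = round_trip_delay(7e3)
f_tiptilt = 1.0 / spgd_cycle_time(0.0, 120e-6, 20e-6, t_delay)

# Toy 14-channel maximization with a hypothetical quadratic metric.
target = np.ones(14)


def metric(u):
    return -np.sum((u - target) ** 2)


rng = np.random.default_rng(0)
u = np.zeros(14)
J_start = metric(u)
for _ in range(500):
    u = spgd_step(u, metric, gain=1.0, amp=0.1, rng=rng)
J_end = metric(u)
```

With these numbers the double-pass delay evaluates to about 46.7 μs and the tip-tilt rate to a little under 3 kHz, in line with the values quoted above.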

The piston phase control subsystem utilized the
fiber-integrated phase shifters of the MOPA system, which have
a short response time of $\tau_{\mathrm{resp}}$ < 10 ns so that the limiting
factor for increasing the SPGD control iteration rate is the
double-pass delay time $\tau_{\mathrm{delay}}$. Considering $\tau_{\mathrm{delay}}$ = 46.7 μs,
the piston-phase control SPGD cycle time is at least
~100 μs and thus $f_{\mathrm{SPGD}}$ < 10 kHz. Note that the SPGD ±
CU 8D controller from Optonicus used in the experiments
can provide much higher iteration rates (up to ~250 kHz)
[10]. Therefore, the propagation delay imposed the
limit on the operational bandwidth of the conventional
SPGD-based piston-phase control subsystem and its
capability for mitigation of atmospheric turbulence-induced
aberrations.

In order to overcome this problem, we utilized in the
piston-phase control subsystem the recently proposed
delayed-SPGD wavefront control algorithm, where the
iterative procedure of the control voltage update during
each iteration cycle (n) can be described by the following
rule [9]:

$$u_j^{(n+1)} = u_j^{(n)} + \gamma\left[J_+^{(n)} - J_-^{(n)}\right]\delta u_j^{(n-\Delta n)}, \quad (j = 1, \ldots, 7). \qquad (3)$$

Here the integer number Δn > 0 is the delay parameter
that accounts for the double-pass propagation time. In
Eq. (3), Δn links the variation of the metric $\delta J^{(n)} = [J_+^{(n)} - J_-^{(n)}]$
measured during iteration (n) to the control signal
perturbations $\{\delta u_j^{(n-\Delta n)}\}$ which caused the metric
change. The delay parameter can be calculated as the
closest integer number to the ratio $\tau_{\mathrm{delay}}/\tau_{\mathrm{SPGD}}$. With
the SPGD cycle time $\tau_{\mathrm{SPGD}}$ = 7.0 μs (iteration rate
$f_{\mathrm{SPGD}}$ ~ 143 kHz) and $\tau_{\mathrm{delay}}$ = 46.7 μs, we obtain Δn = 7.
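
The bookkeeping behind Eq. (3) can be illustrated with a small Python simulation in which each metric difference only becomes available Δn iterations after its perturbation was applied. This is a sketch under simplifying assumptions, not the hardware algorithm: the delay is modeled with a first-in, first-out queue, the quadratic `toy_metric` replaces the real PIB signal, and the gain and amplitude values are arbitrary.

```python
from collections import deque

import numpy as np


def delay_parameter(tau_delay, tau_spgd):
    """Closest integer to the ratio tau_delay / tau_SPGD."""
    return int(round(tau_delay / tau_spgd))


def delayed_spgd(metric, u0, gain, amp, delay, iterations, seed=0):
    """Delayed SPGD: pair each arriving metric change with the perturbation
    applied `delay` iterations earlier, as in Eq. (3)."""
    rng = np.random.default_rng(seed)
    u = np.array(u0, dtype=float)
    in_flight = deque()   # (perturbation, metric change) pairs still propagating
    for _ in range(iterations):
        du = amp * rng.choice([-1.0, 1.0], size=u.shape)
        dJ = metric(u + du) - metric(u - du)
        in_flight.append((du, dJ))
        if len(in_flight) > delay:
            du_old, dJ_old = in_flight.popleft()   # arrives `delay` iterations late
            u = u + gain * dJ_old * du_old
    return u


delta_n = delay_parameter(46.7e-6, 7.0e-6)

goal = np.ones(7)   # hypothetical optimum of the toy metric


def toy_metric(u):
    return -np.sum((u - goal) ** 2)


u_final = delayed_spgd(toy_metric, np.zeros(7), gain=2.0, amp=0.05,
                       delay=delta_n, iterations=3000)
```

Setting `delay=0` reduces the loop to the conventional rule of Eq. (1). In this toy setting the gain has to be kept modest so that the control vector changes little over Δn iterations.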

During the experiments, the fiber-collimator array
control system repeated 50 trials, each 5.25 s long and
comprising three operational states of 1.75 s each.
These states are indicated in Fig. 2 as “feedback off” (all
control loops were off), “piston control on,” and “piston
and tip-tilt control on.” In the “piston control on” state,
the piston-phase (phase-locking) control system was
turned on. During the last state, both the piston and
tip-tilt control subsystems were switched on. Values for
the PIB metric, J, were recorded for all 50 trials by the
supervising controller at a rate of about 10 k samples/s.

Fig. 2. (Color online) Experimental results from the coherent
beam combining experiment: (a) average PIB metric evolution
curve ⟨J⟩ and (b), (c) averaged irradiance distributions at the
target plane with feedback off (b) and piston control on (c).

As shown in Fig. 1(a), the retroreflector at the target
plane was mounted behind a hole in a cardboard screen
and a small piece of retroreflecting tape (~6 mm
diameter) was glued onto the center of the retroreflector’s
cover glass. A near-IR camera with a wide-angle
objective was placed about 1 m in front and 20 cm to the side of
the retroreflector and used to image the irradiance
pattern (beam footprint) on the screen and the retrotape.

Figure 2(a) shows the time dependence of the
trial-averaged PIB metric ⟨J⟩ for two different settings of
the piston-phase controller: the first curve (red, lower)
corresponds to the conventional SPGD algorithm
(1) and the second curve (blue, upper) to the delayed
algorithm (3). In comparison to the open loop state, the
average PIB metric, ⟨J⟩, increased 3.7-fold for the
conventional and 5.6-fold for the delayed-SPGD control.

Recorded target-plane beam footprints (averages of
270 frames) for the cases with feedback off and
piston-phase control on can be seen in Figs. 2(b) and 2(c),
respectively. The dark annular region in the center
corresponds to the circular opening for the retroreflector
with the retrotape spot in the center. A comparison of
these two images demonstrates the higher concentration
of the beam energy at the retroreflector when phase
control is on and proves that the PIB metric maximization
locks the beam phases at the target plane.

The experimental results in Fig. 2(a) correspond to
atmospheric turbulence conditions characterized by a
path-averaged refractive index structure parameter $C_n^2$ =
6 × 10⁻¹⁶ m⁻²ᐟ³ (measured by a Scintec BLS2000
scintillometer [14]) and a normalized standard deviation of
metric fluctuations σ_J/⟨J⟩ = 0.92 (open loop). Piston
control resulted not only in the increase of the average
metric value, but also led to a decrease in the metric
fluctuation level down to σ_J/⟨J⟩ = 0.52 for the
conventional and to 0.42 for the delayed SPGD controllers.

Note that the tip-tilt control subsystem, which was
turned on during the last state of the adaptation trials,
did not result in a further metric increase (and caused
only a slight change in metric fluctuations due to the
tip-tilt perturbations). This can be explained by taking
into account the 48-fold faster updates of the
piston-phase control system, which can provide a partial
mitigation of overall wavefront phase tip and tilt aberrations
using a stepwise (piston) approximation prior to a
reaction of the tip-tilt subsystem. However, our experiments
showed that efficient coherent combining with
piston-phase control was only possible if the transmitted beams
overlapped well at the target, which was achieved by turning
on the tip-tilt control subsystem for a few seconds in
addition to piston control. In the experiments described
above, tip-tilt control voltages were fixed at the end
of each adaptation trial and provided sufficient
overlapping during the piston control stage of the next
adaptation cycle. Without a tip-tilt control phase in each trial,
we observed a slow (on the order of 100-200 s) decline in
coherent beam combining efficiency, indicating that
static tip-tilt control voltages do not maintain efficient
overlapping of the outgoing beams at the target over a longer
time period, mostly due to thermal expansion-induced
system misalignments.

This  work  was  performed  in  the  frame  of  collaborative 
agreement  W911NF-09-2-0040  between  the  United  States 
Army  Research  Laboratory  and  the  University  of  Dayton. 

Atmospheric turbulence is a serious problem for satellite and aircraft-to-ground based classical
imaging. Taking advantage of the natural, nonfactorizable, point-to-point correlation of thermal
light, this experiment demonstrated turbulence-free ghost imaging, which will be extremely useful
for these applications. In addition, this observation suggests that the nontrivial intensity-intensity
correlation of thermal light cannot be caused by the statistical correlation of intensity fluctuations.
© 2011 American Institute of Physics. [doi:10.1063/1.3567931]


One of the most surprising consequences of quantum
mechanics is the nonlocal correlation of a multi-particle
system measured by joint-detection of distant particle detectors.
Ghost imaging is one such phenomenon. Recently Meyers et
al.1,2 performed ghost imaging of remote objects by
measuring reflected photons, thereby making ghost imaging
practical for applications. Two types of ghost imaging have been
experimentally demonstrated since 1995. One type of ghost
imaging uses entangled photon pairs as the light source3 and
another type of ghost imaging uses chaotic thermal light.1,4
The nonlocal multiphoton interference nature of ghost
imaging determines its peculiar features: (1) it is nonlocal; (2) its
imaging resolution differs from that of classical imaging.

Recently there has been increased interest in ghost
imaging through turbulence and related index of refraction
distortions as demonstrated by theoretical5 and experimental
papers.2,6 However, thermal light ghost imaging through
turbulence has not been previously reported. In this letter, we
wish to report a recent ghost imaging experiment with
thermal light which demonstrated another peculiar yet useful
feature of ghost imaging: “turbulence-free,” i.e., any index of
refraction fluctuation of turbulence in the optical path will
not affect the quality of the ghost image. This important
feature will be useful for applications like satellite and
aircraft-to-ground based distant imaging, for which atmospheric
turbulence is a serious problem. We present the main result
from the ghost imaging experiment performed at the US
Army Research Laboratory (ARL), namely that the ghost
image is virtually free of the adverse effects of turbulence.
We also highlight the two-photon interference nature of the
ghost imaging as the primary cause of this turbulence-free
effect. We expand on the effects of turbulence on ghost
imaging in another letter.7

A schematic of the experimental setup is shown in Fig.
1. It is a typical thermal light lensless ghost imaging setup,1
except for the addition of heating elements to produce
laboratory atmospheric turbulence. It uses secondary ghost
imaging which helps make ghost imaging practical for
applications. In this experiment, turbulence is introduced by adding
heating elements at 550 °C underneath any or all optical
paths as illustrated in Fig. 1. Heating of the air causes
temporal and spatial fluctuations in its index of refraction that
make the classical image of the object jitter about randomly
on the image plane causing a “blurred” picture. As in our
earlier experiment,1 the light source is a typical chaotic
pseudo-thermal source, which contains a laser beam and a
fast rotating ground glass diffuser. The chaotically scattered
laser beam, with a fairly large size (11 mm diameter) in
transverse dimension, is split into two by a 50%-50%
beamsplitter. One of the beams illuminates an object located at $z_1$,
such as the letters “ARL” shown in Fig. 1. The photons
scattered and reflected from the object are collected and
counted by a “bucket” detector, which is simulated by the
right-half of the charge-coupled device (CCD) in Fig. 1.
The other beam propagates to the ghost image plane of $z_2 = z_1$
~ 1.4 m and the path from the target to the detectors
over heating elements is ~1.7 m. As in our earlier
demonstration of ghost imaging, placing a CCD array on the ghost
image plane allows it to capture the ghost image of the
object if its exposure is gated by the bucket detector.1 Our
CCD array imaged the target and reference planes located on
a sheet of paper where one half is glossy white and the other
half contains the target. The scattered and reflected light
from the glossy white half of the paper, which contains the
reference spatial information for the ghost image, is then
captured by the left-half of the high resolution CCD camera
operating in the photon counting regime. The CCD camera is
focused onto the ghost image plane and is gated by the
bucket detector for the generation of the secondary ghost
image. Analogously, Valencia et al.4 focused light from the
source onto the imaging plane to be measured with a
scanning fiber tip. In our special setup each half of the CCD
camera can play the role of an independent classical camera
in its “normal” ungated operation. The hardware circuit and
the software program monitor the outputs of the left-half and
the right-half of the CCD individually, as two independent
classical cameras, and simultaneously monitor the gated
output of the left-half of the CCD as a ghost camera. The
classical image and the secondary ghost image of the object
were captured and monitored simultaneously when
turbulence was introduced to any or to all of the optical paths.

FIG. 1. (Color) (a) Schematic setup of a typical lensless ghost imaging
experiment with thermal light in which significant turbulence is introduced
in its optical paths. Dashed arrows indicate the optical path to the “bucket”
and solid arrows indicate the optical path of the reference image. (b) The
inset depicts typical turbulence structures measured during the experiment.

FIG. 2. (Color online) From left to right: (a) a ghost image of the letter “A,”
(b) an averaged image of the letter “A” followed by three classical image
“shots” of letter “A” with turbulence (c)-(e). These images were taken
through the bucket detector arm. Notice the turbulence induced jittering and
distortion of the images from one “shot” to another “shot.” Both the ghost
image and the averaged classical image came from the same turbulence. The
ghost image is not adversely affected by the turbulence while the average of
the classical images shows significant degradation.

The  effect  of  turbulence  on  a  classical  image  is  easily 
seen  in  Figs.  2(c)-2(e),  in  which  three  sequential  classical 
image  “shots”  of  the  letter  “A”  were  taken  by  the  right-half 
of  the  CCD  in  its  normal  operation  as  a  classical  camera. 
Integrating  a  number  (10  000)  of  sequential  images  with  an 
exposure  time  of  1  ms  results  in  a  blurred  image,  Fig.  2(b). 
Turbulence is a result of strong stochastic space and time
variations in fluid properties such as the velocity components
$u_i(\mathbf{x},t)$ and the index of refraction $\eta(\mathbf{x},t)$.
Our experiments used real-time imaging, Fig. 1(b), to extract
properties for objective characterization of the anisotropic
inhomogeneous turbulence.⁸ Velocity probability density functions,
velocity correlations
$\langle u_i(\mathbf{x}_1,t_1)\,u_j(\mathbf{x}_2,t_2)\rangle$,
and their time and space scales were computed. The turbulence
velocity correlation space scales were found to be 1-2.5 mm and
the turbulence velocity correlation time scales were 2.5-5 ms.
While each beam distorts and spreads due to turbulence, the
point-to-point correlation between the image plane and the object
plane is maintained. The small space turbulence scales indicate
that the reference and bucket beams experience independent
turbulence deviations, and the small time correlation scales
indicate that the image frames at different times also experience
different turbulence realizations. Optical turbulence is the
variability of light propagating through $\eta(\mathbf{x},t)$
fluctuations, and is often characterized by the structure
parameter $C_n^2$, defined through
$\langle[\eta(\mathbf{x})-\eta(\mathbf{x}+\mathbf{r})]^2\rangle = C_n^2\,\Gamma$,
where $\Gamma$ is a scaling function that is often set to $r^{2/3}$.
$C_n^2$ is a standard means of characterizing both laboratory and
atmospheric optical turbulence and has dimensions of
length$^{-2/3}$, rendering the structure function dimensionless.
Using standard methods⁹ we determined that the images in Fig. 2
were taken in high turbulence with $C_n^2 = 1.5\times10^{-12}$.
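As a rough illustration of the structure-function scaling quoted above, the following sketch evaluates $C_n^2 r^{2/3}$ at separations comparable to the measured 1-2.5 mm correlation scales. The text does not state the units of the structure parameter, so the code assumes it is expressed in m$^{-2/3}$; the function name `structure_function` is a hypothetical helper, not something from the experiment.

```python
import math


def structure_function(r_m, cn2=1.5e-12):
    """Return D(r) = C_n^2 * r^(2/3) for a separation r.

    r_m : separation in metres (assumed unit)
    cn2 : structure parameter, ASSUMED to be in m^(-2/3);
          the default is the value quoted in the text.
    """
    if r_m < 0:
        raise ValueError("separation must be non-negative")
    return cn2 * r_m ** (2.0 / 3.0)


# Separations spanning the reported velocity-correlation space scales.
for r_mm in (1.0, 2.5):
    d = structure_function(r_mm * 1e-3)
    print(f"r = {r_mm} mm -> D(r) = {d:.3e}")
```

Because $C_n^2$ carries dimensions of length$^{-2/3}$ and the scaling function carries length$^{2/3}$, the returned value is dimensionless under the stated unit assumption.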

This experiment demonstrates that neither the spatial
resolution nor the contrast of the ghost image was affected
significantly by the turbulence present in the bucket and
reference optical paths. Figures 2(a) and 2(b) compare a ghost

FIG. 3. (Color online) A ghost image of “ARL” without turbulence (a) is
compared with a ghost image of “ARL” with significant turbulence (b). In
this measurement turbulence was introduced into all optical paths as shown
in Fig. 1. There is virtually no difference when turbulence is introduced.

image (a) with a simple average image (b) under the same
turbulence conditions as in Figs. 2(c)-2(e). It is clear that the
ghost imaging resolution surpasses the resolution of the
“average image” in turbulence. In Fig. 3, a ghost image of
“ARL” with the same high turbulence described above is
compared with another ghost image of “ARL” without
turbulence. It is difficult to see the difference. Visibilities in
Figs. 3(a) and 3(b) after normalization were 75% for the
nonturbulence case and 33% for the turbulence case.
Subtracting out the nearly constant backgrounds in both the
turbulence and nonturbulence cases yielded nearly 100%
visibility. The quality of the ghost image is virtually unaffected
by turbulence even though the turbulence acts to scatter the
energy along the optical paths to the target and detection
planes. Further experiments were performed with exposure
times as short as 1 μs and yielded results similar to those
presented above. Resolution was quantified for Figs.
2(a) and 2(b) by applying a Gaussian point spread function
(PSF) to the initial “A” target to approximate Figs. 2(a) and
2(b), respectively. PSF standard deviations in each dimension
were 1.6 pixels to match the ghost image of Fig. 2(a)
and 3.2 pixels to approximate the classical average of Fig.
2(b), which still had unaccounted-for aberrations. In summary,
the ghost image appears relatively undistorted by
turbulence, i.e., turbulence-free.
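The PSF-based resolution comparison can be sketched numerically. The example below is only an illustration under stated assumptions: the target is a synthetic plus-shaped mask standing in for the letter “A” (the actual target data are not available here), and `gaussian_kernel` and `gaussian_blur` are hypothetical helpers that implement a separable Gaussian blur with NumPy. Only the two standard deviations, 1.6 and 3.2 pixels, come from the text.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=4.0):
    """1-D Gaussian kernel, normalized to unit sum."""
    radius = int(np.ceil(truncate * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def gaussian_blur(image, sigma):
    """Apply a separable Gaussian PSF: blur rows, then columns."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, k, mode="same"), 0, rows)


# Synthetic stand-in target (ASSUMPTION: not the real "A" mask).
target = np.zeros((64, 64))
target[16:48, 30:34] = 1.0   # vertical stroke, 4 pixels wide
target[30:34, 16:48] = 1.0   # horizontal stroke, 4 pixels wide

ghost_like = gaussian_blur(target, sigma=1.6)      # PSF width quoted for Fig. 2(a)
classical_like = gaussian_blur(target, sigma=3.2)  # PSF width quoted for Fig. 2(b)

print("peak after 1.6-pixel PSF:", ghost_like.max())
print("peak after 3.2-pixel PSF:", classical_like.max())
```

The wider PSF spreads the same total intensity over more pixels, so the peak of the blurred strokes drops and their edges soften, which is the sense in which the averaged classical image has lower resolution than the ghost image.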

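The quantum theory of photodetection¹⁰ gives a reasonable
interpretation of the turbulence-free ghost imaging of
thermal light. In Glauber’s theory, a joint detection of two
independent point photodetectors measures the probability of
observing a joint-detection event of two photons at space-time
points $(\mathbf{r}_1,t_1)$ and $(\mathbf{r}_2,t_2)$:

$$G^{(2)}(\mathbf{r}_1,t_1;\mathbf{r}_2,t_2) = \left\langle\left\langle E_1^{(-)}E_2^{(-)}E_2^{(+)}E_1^{(+)}\right\rangle_{\mathrm{QM}}\right\rangle_{\mathrm{Ensemble}}, \tag{1}$$

where $E_j^{(\pm)}(\mathbf{r}_j,t_j)$, $j=1,2$, are the negative and positive field
operators at $(\mathbf{r}_1,t_1)$ and $(\mathbf{r}_2,t_2)$. We have proven that the
quantum expectation is the result of a superposition¹

$$\left\langle\left\langle E_1^{(-)}E_2^{(-)}E_2^{(+)}E_1^{(+)}\right\rangle_{\mathrm{QM}}\right\rangle_{\mathrm{Ensemble}} = \left| g_2(\boldsymbol{\rho}_2,z_2;\kappa)\,g_1(\boldsymbol{\rho}_1,z_1;\kappa') + g_2(\boldsymbol{\rho}_2,z_2;\kappa')\,g_1(\boldsymbol{\rho}_1,z_1;\kappa) \right|^2, \tag{2}$$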

where $g_j(\boldsymbol{\rho}_j,z_j;\kappa)$ is the Green’s function, which propagates
the field from the source to the $j$th photodetector in the
image and object arms.¹¹,¹² Equation (2) indicates an interference
between two quantum amplitudes, corresponding to two
alternatives, different yet indistinguishable, which lead to a
joint photodetection event. This interference involves both
arms of the optical setup as well as two distant photodetection
events at $(\boldsymbol{\rho}_1,z_1)$ and $(\boldsymbol{\rho}_2,z_2)$, respectively. Figure 4
schematically illustrates the two alternatives for a pair of

FIG. 4. (Color) Schematic illustration of two-photon interference:
a superposition between $g_2(\boldsymbol{\rho}_2,z_2;\kappa)\,g_1(\boldsymbol{\rho}_1,z_1;\kappa')$
and $g_2(\boldsymbol{\rho}_2,z_2;\kappa')\,g_1(\boldsymbol{\rho}_1,z_1;\kappa)$. The two-photon amplitudes
$g_2(\boldsymbol{\rho}_2,z_2;\kappa)\,g_1(\boldsymbol{\rho}_1,z_1;\kappa')$ and $g_2(\boldsymbol{\rho}_2,z_2;\kappa')\,g_1(\boldsymbol{\rho}_1,z_1;\kappa)$ will superpose
constructively at $\boldsymbol{\rho}_1=\boldsymbol{\rho}_2$ and $z_1=z_2$.

modes $\kappa$ and $\kappa'$ to produce a joint photodetection event.

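Now, we introduce an arbitrary phase disturbance $e^{i\Delta\varphi_1(\boldsymbol{\rho}_1)}$
into the image arm and another phase disturbance $e^{i\Delta\varphi_2(\boldsymbol{\rho}_2)}$ into
the object arm to simulate the turbulence, where $\Delta\varphi_1(\boldsymbol{\rho}_1)$ and
$\Delta\varphi_2(\boldsymbol{\rho}_2)$ add random phases onto the radiation at transverse
coordinates $\boldsymbol{\rho}_1$ and $\boldsymbol{\rho}_2$, respectively. The quantum expectation
is thus

$$\left| g_2(\boldsymbol{\rho}_2,z_2;\kappa)\,e^{i\Delta\varphi_2(\boldsymbol{\rho}_2)}\,g_1(\boldsymbol{\rho}_1,z_1;\kappa')\,e^{i\Delta\varphi_1(\boldsymbol{\rho}_1)} + g_2(\boldsymbol{\rho}_2,z_2;\kappa')\,e^{i\Delta\varphi_2(\boldsymbol{\rho}_2)}\,g_1(\boldsymbol{\rho}_1,z_1;\kappa)\,e^{i\Delta\varphi_1(\boldsymbol{\rho}_1)} \right|^2 = \left| g_2(\boldsymbol{\rho}_2,z_2;\kappa)\,g_1(\boldsymbol{\rho}_1,z_1;\kappa') + g_2(\boldsymbol{\rho}_2,z_2;\kappa')\,g_1(\boldsymbol{\rho}_1,z_1;\kappa) \right|^2. \tag{3}$$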

Notice that the phase disturbances introduced by the turbulence
have a null effect on the second-order correlation function
$G^{(2)}(\boldsymbol{\rho}_1,z_1;\boldsymbol{\rho}_2,z_2)$ of Eq. (1). The normalized nonfactorizable
point-to-point image-forming correlation $g^{(2)}(\boldsymbol{\rho}_1;\boldsymbol{\rho}_2)$
of thermal light is thus turbulence-free. The two-photon symmetric
wave function conditions established in the theory
show the aberration-cancelling effect of turbulence-free
ghost imaging.
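The cancellation in Eq. (3) follows because both two-photon amplitudes acquire the same overall phase factor, which has unit modulus and drops out of the squared magnitude. A minimal numerical check, under stated assumptions, is sketched below: the Green’s function values are random complex numbers used purely as placeholders, and `joint_detection` is a hypothetical helper name, not part of the original analysis.

```python
import numpy as np

rng = np.random.default_rng(0)


def joint_detection(g1_k, g1_kp, g2_k, g2_kp, dphi1=0.0, dphi2=0.0):
    """Squared modulus of the two superposed two-photon amplitudes.

    g1_k, g1_kp : placeholder values of g_1 for modes kappa and kappa'
    g2_k, g2_kp : placeholder values of g_2 for modes kappa and kappa'
    dphi1, dphi2: phase disturbances on the image and object arms
    """
    p1 = np.exp(1j * dphi1)
    p2 = np.exp(1j * dphi2)
    first = g2_k * p2 * g1_kp * p1    # g2(kappa)  g1(kappa')
    second = g2_kp * p2 * g1_k * p1   # g2(kappa') g1(kappa)
    return np.abs(first + second) ** 2


n = 1000
# Placeholder complex amplitudes (ASSUMPTION: arbitrary random values).
g1_k, g1_kp, g2_k, g2_kp = (
    rng.normal(size=n) + 1j * rng.normal(size=n) for _ in range(4)
)
# Independent random phase disturbances on each arm.
dphi1 = rng.uniform(0.0, 2.0 * np.pi, size=n)
dphi2 = rng.uniform(0.0, 2.0 * np.pi, size=n)

without_turbulence = joint_detection(g1_k, g1_kp, g2_k, g2_kp)
with_turbulence = joint_detection(g1_k, g1_kp, g2_k, g2_kp, dphi1, dphi2)

print("max difference:", np.max(np.abs(with_turbulence - without_turbulence)))
```

The two results agree to floating-point precision for every sample, since the common factor $e^{i[\Delta\varphi_1+\Delta\varphi_2]}$ multiplies both terms of the superposition.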

As proven earlier¹ and demonstrated here in turbulence,
the image-forming correlation $g^{(2)}(\boldsymbol{\rho}_1;\boldsymbol{\rho}_2)$ of thermal light is
a nonfactorizable point-to-point intensity-intensity correlation
that comes about from quantum superposition of two-photon
amplitudes instead of classical correlation of intensity
fluctuations. See Agarwal et al. on nonclassical interference.
Significantly, the nonlocal nonfactorizable property of
thermal light that we demonstrated could be useful as a
potential resource for a quantum information processing
thermal qubit.


By comparison, a classical simulation of ghost imaging
was proposed by Gatti et al. in which two classical imaging
systems are used to image the speckles of the light source
onto the object plane and the image plane, respectively, to
form a trivial speckle-to-speckle correlation. Under turbulence,
the object and image plane speckles would be blurred
in a random manner, as would the factorizable
speckle-to-speckle correlation.

In conclusion, we have demonstrated the peculiar
turbulence-free feature of ghost imaging with thermal light,
which can be extremely useful for applications such as
distant imaging. The turbulence-free ghost imaging phenomenon
is the result of nonlocal two-photon interference, which
cannot be simulated classically by factorizable
intensity-intensity correlations.
rate 100-2000 GHz superconducting camera for standoff personnel imaging applications, along with the first quasi-optical
automatic target recognition, cross-modal face recognition, activity recognition, and image enhancement. He received an Army
R&D  award  in  2012,  and  has  over  30  publications  in  image  processing  and  fMRI  analysis. 

John  W.  Little  received  his  B.S.  degree  in  Physics  from  Tennessee  Technological  University  in  1976  and  his  Ph.D.  degree 
in  Physics  from  the  University  of  Tennessee,  Knoxville  in  1982.  After  a  year  as  a  postdoctoral  fellow  at  Oak  Ridge  National 
Laboratory (ARL), Adelphi, MD. He has experience in a wide range of optoelectronic and RF device physics and fabrication. He
2005-2007 he undertook studies at the University of Amsterdam Astronomy Department, where he developed and optimized
Dr. Wijewarnasuriya was a member of the team dedicated to the demonstration of novel, large-format infrared focal plane
arrays  for  tactical  and  strategic  military  applications,  as  well  as  astronomy.  Dr.  Wijewarnasuriya  has  authored  or  coauthored 
over  73  papers  in  the  open  technical  literature,  two  book  chapters  and  has  presented  his  work  at  numerous  national  and 
international  conferences. 

David  A.  Wikner  is  the  Millimeter-Wave  Sensor  Technology  Team  Leader  in  the  Sensors  and  Electronic  Devices  Directorate 